Skip to content

Conversation

@sunflower2333
Copy link
Owner

No description provided.

kuba-moo and others added 30 commits December 10, 2025 01:15
nf_conntrack_cleanup_net_list() calls schedule() so it does not
show up as a hung task. Add an explicit check to make debugging
leaked skbs/conntack references more obvious.

Acked-by: Florian Westphal <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Jakub Kicinski says:

====================
inet: frags: flush pending skbs in fqdir_pre_exit()

Fix the issue reported by NIPA starting on Sep 18th [1], where
pernet_ops_rwsem is constantly held by a reader, preventing writers
from grabbing it (specifically driver modules from loading).

The fact that reports started around that time seems coincidental.
The issue seems to be skbs queued for defrag preventing conntrack
from exiting.

First patch fixes another theoretical issue, it's mostly a leftover
from an attempt to get rid of the inet_frag_queue refcnt, which
I gave up on (still think it's doable but a bit of a time sink).
Second patch is a minor refactor.

The real fix is in the third patch. It's the simplest fix I can
think of which is to flush the frag queues. Perhaps someone has
a better suggestion?

Last patch adds an explicit warning for conntrack getting stuck,
as this seems like something that can easily happen if bugs sneak in.
The warning will hopefully save us the first 20% of the investigation
effort.

Link: https://lore.kernel.org/[email protected] # [1]
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Commit 37cce22 ("bpf: verifier: Refactor helper access type
tracking") started distinguishing read vs write accesses performed by
helpers.

The second argument of bpf_d_path() is a pointer to a buffer that the
helper fills with the resulting path. However, its prototype currently
uses ARG_PTR_TO_MEM without MEM_WRITE.

Before 37cce22, helper accesses were conservatively treated as
potential writes, so this mismatch did not cause issues. Since that
commit, the verifier may incorrectly assume that the buffer contents
are unchanged across the helper call and base its optimizations on this
wrong assumption. This can lead to misbehaviour in BPF programs that
read back the buffer, such as prefix comparisons on the returned path.

Fix this by marking the second argument of bpf_d_path() as
ARG_PTR_TO_MEM | MEM_WRITE so that the verifier correctly models the
write to the caller-provided buffer.

Fixes: 37cce22 ("bpf: verifier: Refactor helper access type tracking")
Co-developed-by: Zesen Liu <[email protected]>
Signed-off-by: Zesen Liu <[email protected]>
Co-developed-by: Peili Gao <[email protected]>
Signed-off-by: Peili Gao <[email protected]>
Co-developed-by: Haoran Ni <[email protected]>
Signed-off-by: Haoran Ni <[email protected]>
Signed-off-by: Shuran Liu <[email protected]>
Reviewed-by: Matt Bobrowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
Add a regression test for bpf_d_path() to cover incorrect verifier
assumptions caused by an incorrect function prototype. The test
attaches to the fallocate hook, calls bpf_d_path() and verifies that
a simple prefix comparison on the returned pathname behaves correctly
after the fix in patch 1. It ensures the verifier does not assume
the buffer remains unwritten.

Co-developed-by: Zesen Liu <[email protected]>
Signed-off-by: Zesen Liu <[email protected]>
Co-developed-by: Peili Gao <[email protected]>
Signed-off-by: Peili Gao <[email protected]>
Co-developed-by: Haoran Ni <[email protected]>
Signed-off-by: Haoran Ni <[email protected]>
Signed-off-by: Shuran Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
Shuran Liu says:

====================
bpf: fix bpf_d_path() helper prototype

Hi,

This series fixes a verifier issue with bpf_d_path() and adds a
regression test to cover its use within a hook function.

Patch 1 updates the bpf_d_path() helper prototype so that the second
argument is marked as MEM_WRITE. This makes it explicit to the verifier
that the helper writes into the provided buffer.

Patch 2 extends the existing d_path selftest to cover incorrect verifier
assumptions caused by an incorrect function prototype. The test program calls
bpf_d_path() and checks if the first character of the path can be read.
It ensures the verifier does not assume the buffer remains unwritten.

Changelog
=========

v5:
  - Moved the temporary file for the fallocate test from /tmp to /dev/shm
    Since bpf CI's 9P filesystem under /tmp does not support fallocate.

v4:
  - Use the fallocate hook instead of an LSM hook to simplify the selftest,
    as suggested by Matt and Alexei.
  - Add a utility function in test_d_path.c to load the BPF program,
    improving code reuse.

v3:
  - Switch the pathname prefix loop to use bpf_for() instead of
    #pragma unroll, as suggested by Matt.
  - Remove /tmp/bpf_d_path_test in the test cleanup path.
  - Add the missing Reviewed-by tags.

v2:
  - Merge the new test into the existing d_path selftest rather than
  creating new files.
  - Add PID filtering in the LSM program to avoid nondeterministic failures
  due to unrelated processes triggering bprm_check_security.
  - Synchronize child execution using a pipe to ensure deterministic
  updates to the PID.

Thanks for your time and reviews.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
There are some situations where ct might be leaked as error paths are
skipping the refcounted check and return immediately. In order to solve
it make sure that the check is always called.

Fixes: be102eb ("netfilter: nf_conncount: rework API to use sk_buff directly")
Signed-off-by: Fernando Fernandez Mancera <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
The IPv4 code path in __ip_vs_get_out_rt() calls dst_link_failure()
without ensuring skb->dev is set, leading to a NULL pointer dereference
in fib_compute_spec_dst() when ipv4_link_failure() attempts to send
ICMP destination unreachable messages.

The issue emerged after commit ed0de45 ("ipv4: recompile ip options
in ipv4_link_failure") started calling __ip_options_compile() from
ipv4_link_failure(). This code path eventually calls fib_compute_spec_dst()
which dereferences skb->dev. An attempt was made to fix the NULL skb->dev
dereference in commit 0113d9c ("ipv4: fix null-deref in
ipv4_link_failure"), but it only addressed the immediate dev_net(skb->dev)
dereference by using a fallback device. The fix was incomplete because
fib_compute_spec_dst() later in the call chain still accesses skb->dev
directly, which remains NULL when IPVS calls dst_link_failure().

The crash occurs when:
1. IPVS processes a packet in NAT mode with a misconfigured destination
2. Route lookup fails in __ip_vs_get_out_rt() before establishing a route
3. The error path calls dst_link_failure(skb) with skb->dev == NULL
4. ipv4_link_failure() → ipv4_send_dest_unreach() →
   __ip_options_compile() → fib_compute_spec_dst()
5. fib_compute_spec_dst() dereferences NULL skb->dev

Apply the same fix used for IPv6 in commit 326bf17 ("ipvs: fix
ipv6 route unreach panic"): set skb->dev from skb_dst(skb)->dev before
calling dst_link_failure().

KASAN: null-ptr-deref in range [0x0000000000000328-0x000000000000032f]
CPU: 1 PID: 12732 Comm: syz.1.3469 Not tainted 6.6.114 #2
RIP: 0010:__in_dev_get_rcu include/linux/inetdevice.h:233
RIP: 0010:fib_compute_spec_dst+0x17a/0x9f0 net/ipv4/fib_frontend.c:285
Call Trace:
  <TASK>
  spec_dst_fill net/ipv4/ip_options.c:232
  spec_dst_fill net/ipv4/ip_options.c:229
  __ip_options_compile+0x13a1/0x17d0 net/ipv4/ip_options.c:330
  ipv4_send_dest_unreach net/ipv4/route.c:1252
  ipv4_link_failure+0x702/0xb80 net/ipv4/route.c:1265
  dst_link_failure include/net/dst.h:437
  __ip_vs_get_out_rt+0x15fd/0x19e0 net/netfilter/ipvs/ip_vs_xmit.c:412
  ip_vs_nat_xmit+0x1d8/0xc80 net/netfilter/ipvs/ip_vs_xmit.c:764

Fixes: ed0de45 ("ipv4: recompile ip options in ipv4_link_failure")
Signed-off-by: Slavin Liu <[email protected]>
Acked-by: Julian Anastasov <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Always set nf_flow_route tuple out ifindex even if the indev is not one
of the flowtable configured devices since otherwise the outdev lookup in
nf_flow_offload_ip_hook() or nf_flow_offload_ipv6_hook() for
FLOW_OFFLOAD_XMIT_NEIGH flowtable entries will fail.
The above issue occurs in the following configuration since IP6IP6
tunnel does not support flowtable acceleration yet:

$ip addr show
5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:11:22:33:22:55 brd ff:ff:ff:ff:ff:ff link-netns ns1
    inet6 2001:db8:1::2/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::211:22ff:fe33:2255/64 scope link tentative proto kernel_ll
       valid_lft forever preferred_lft forever
6: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:22:22:33:22:55 brd ff:ff:ff:ff:ff:ff link-netns ns3
    inet6 2001:db8:2::1/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::222:22ff:fe33:2255/64 scope link tentative proto kernel_ll
       valid_lft forever preferred_lft forever
7: tun0@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN group default qlen 1000
    link/tunnel6 2001:db8:2::1 peer 2001:db8:2::2 permaddr a85:e732:2c37::
    inet6 2002:db8:1::1/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::885:e7ff:fe32:2c37/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever

$ip -6 route show
2001:db8:1::/64 dev eth0 proto kernel metric 256 pref medium
2001:db8:2::/64 dev eth1 proto kernel metric 256 pref medium
2002:db8:1::/64 dev tun0 proto kernel metric 256 pref medium
default via 2002:db8:1::2 dev tun0 metric 1024 pref medium

$nft list ruleset
table inet filter {
        flowtable ft {
                hook ingress priority filter
                devices = { eth0, eth1 }
        }

        chain forward {
                type filter hook forward priority filter; policy accept;
                meta l4proto { tcp, udp } flow add @ft
        }
}

Fixes: b5964aa ("netfilter: flowtable: consolidate xmit path")
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Jakub says: "We try to reserve SKIP for tests skipped because tool is
missing in env, something isn't built into the kernel etc."

use xfail, we can't force the race condition to appear at will
so its expected that the test 'fails' occasionally.

Fixes: 78a5883 ("selftests: netfilter: add conntrack clash resolution test case")
Reported-by: Jakub Kicinski <[email protected]>
Closes: https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Florian Westphal <[email protected]>
When CONFIG_SLUB_TINY is enabled, kfree_nolock() calls kasan_slab_free()
before defer_free(). On ARM64 with MTE (Memory Tagging Extension),
kasan_slab_free() poisons the memory and changes the tag from the
original (e.g., 0xf3) to a poison tag (0xfe).

When defer_free() then tries to write to the freed object to build the
deferred free list via llist_add(), the pointer still has the old tag,
causing a tag mismatch and triggering a KASAN use-after-free report:

  BUG: KASAN: slab-use-after-free in defer_free+0x3c/0xbc mm/slub.c:6537
  Write at addr f3f000000854f020 by task kworker/u8:6/983
  Pointer tag: [f3], memory tag: [fe]

Fix this by calling kasan_reset_tag() before accessing the freed memory.
This is safe because defer_free() is part of the allocator itself and is
expected to manipulate freed memory for bookkeeping purposes.

Fixes: af92793 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Cc: [email protected]
Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=7a25305a76d872abcfa1
Tested-by: [email protected]
Signed-off-by: Deepanshu Kartikey <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Vlastimil Babka <[email protected]>
Flower is commonly used to match on packets in many bash-based selftests.
A dump of a flower filter including statistics looks something like this:

 [
   {
     "protocol": "all",
     "pref": 49152,
     "kind": "flower",
     "chain": 0
   },
   {
     ...
     "options": {
       ...
       "actions": [
         {
	   ...
           "stats": {
             "bytes": 0,
             "packets": 0,
             "drops": 0,
             "overlimits": 0,
             "requeues": 0,
             "backlog": 0,
             "qlen": 0
           }
         }
       ]
     }
   }
 ]

The JQ query in the helper function tc_rule_stats_get() assumes this form
and looks for the second element of the array.

However, a dump of a u32 filter looks like this:

 [
   {
     "protocol": "all",
     "pref": 49151,
     "kind": "u32",
     "chain": 0
   },
   {
     "protocol": "all",
     "pref": 49151,
     "kind": "u32",
     "chain": 0,
     "options": {
       "fh": "800:",
       "ht_divisor": 1
     }
   },
   {
     ...
     "options": {
       ...
       "actions": [
         {
	   ...
           "stats": {
             "bytes": 0,
             "packets": 0,
             "drops": 0,
             "overlimits": 0,
             "requeues": 0,
             "backlog": 0,
             "qlen": 0
           }
         }
       ]
     }
   },
 ]

There's an extra element which the JQ query ends up choosing.

Instead of hard-coding a particular index, look for the entry on which a
selector .options.actions yields anything.

Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Link: https://patch.msgid.link/12982a44471c834511a0ee6c1e8f57e3a5307105.1765289566.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
This test runs an overlay traffic, forwarded over a multicast-routed VXLAN
underlay. In order to determine whether packets reach their intended
destination, it uses a TC match. For convenience, it uses a flower match,
which however does not allow matching on the encapsulated packet. So
various service traffic ends up being indistinguishable from the test
packets, and ends up confusing the test. To alleviate the problem, the test
uses sleep to allow the necessary service traffic to run and clear the
channel, before running the test traffic. This worked for a while, but
lately we have nevertheless seen flakiness of the test in the CI.

Fix the issue by using u32 to match the encapsulated packet as well. The
confusing packets seem to always be IPv6 multicast listener reports.
Realistically they could be ARP or other ICMP6 traffic as well. Therefore
look for ethertype IPv4 in the IPv4 traffic test, and for IPv6 / UDP
combination in the IPv6 traffic test.

Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Link: https://patch.msgid.link/6438cb1613a2a667d3ff64089eb5994778f247af.1765289566.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
After fixing traffic matching in the previous patch, the test does not need
to use the sleep anymore. So drop vx_wait() altogether, migrate all callers
of vx{10,20}_create_wait() to the corresponding _create(), and drop the now
unused _create_wait() helpers.

Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Link: https://patch.msgid.link/eabfe4fa12ae788cf3b8c5c876a989de81dfc3d3.1765289566.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
Petr Machata says:

====================
selftests: forwarding: vxlan_bridge_1q_mc_ul: Fix flakiness

The net/forwarding/vxlan_bridge_1q_mc_ul selftest runs an overlay traffic,
forwarded over a multicast-routed VXLAN underlay. In order to determine
whether packets reach their intended destination, it uses a TC match. For
convenience, it uses a flower match, which however does not allow matching
on the encapsulated packet. So various service traffic ends up being
indistinguishable from the test packets, and ends up confusing the test. To
alleviate the problem, the test uses sleep to allow the necessary service
traffic to run and clear the channel, before running the test traffic. This
worked for a while, but lately we have nevertheless seen flakiness of the
test in the CI.

In this patchset, first generalize tc_rule_stats_get() to support u32 in
patch #1, then in patch #2 convert the test to use u32 to allow parsing
deeper into the packet, and in #3 drop the now-unnecessary sleep.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
…l/git/netfilter/nf

Florian Westphal says:

====================
netfilter: updates for net

1) Fix refcount leaks in nf_conncount, from Fernando Fernandez Mancera.
   This addresses a recent regression that came in the last -next
   pull request.

2) Fix a null dereference in route error handling in IPVS, from Slavin
   Liu.  This is an ancient issue dating back to 5.1 days.

3) Always set ifindex in route tuple in the flowtable output path, from
   Lorenzo Bianconi.  This bug came in with the recent output path refactoring.

4) Prefer 'exit $ksft_xfail' over 'exit $ksft_skip' when we fail to
   trigger a nat race condition to exercise the clash resolution path in
   selftest infra, $ksft_skip should be reserved for missing tooling,
   From myself.

* tag 'nf-25-12-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  selftests: netfilter: prefer xfail in case race wasn't triggered
  netfilter: always set route tuple out ifindex
  ipvs: fix ipv4 null-ptr-deref in route error path
  netfilter: nf_conncount: fix leaked ct in error paths
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
…/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2025-12-10

Arnd Bergmann's patch fixes a build dependency with the CAN protocols
and drivers introduced in the current development cycle.

The last patch is by me and fixes the error handling cleanup in the
gs_usb driver.

* tag 'linux-can-fixes-for-6.19-20251210' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: gs_usb: gs_can_open(): fix error handling
  can: fix build dependency
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Some Potron SFP+ XGSPON ONU sticks are shipped with different EEPROM
vendor ID and vendor name strings, but are otherwise functionally
identical to the existing "Potron SFP+ XGSPON ONU Stick" handled by
sfp_quirk_potron().

These modules, including units distributed under the "Better Internet"
branding, use the same UART pin assignment and require the same
TX_FAULT/LOS behaviour and boot delay. Re-use the existing Potron
quirk for this EEPROM variant.

Signed-off-by: Marcus Hughes <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
The cffrml_receive() function extracts a length field from the packet
header and, when FCS is disabled, subtracts 2 from this length without
validating that len >= 2.

If an attacker sends a malicious packet with a length field of 0 or 1
to an interface with FCS disabled, the subtraction causes an integer
underflow.

This can lead to memory exhaustion and kernel instability, potential
information disclosure if padding contains uninitialized kernel memory.

Fix this by validating that len >= 2 before performing the subtraction.

Reported-by: Yuhao Jiang <[email protected]>
Reported-by: Junrui Luo <[email protected]>
Fixes: b482cd2 ("net-caif: add CAIF core protocol stack")
Signed-off-by: Junrui Luo <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/SYBPR01MB7881511122BAFEA8212A1608AFA6A@SYBPR01MB7881.ausprd01.prod.outlook.com
Signed-off-by: Jakub Kicinski <[email protected]>
…o strict

Whenever a user issues an ets qdisc change command, transforming a
drr class into a strict one, the ets code isn't checking whether that
class was in the active list and removing it. This means that, if a
user changes a strict class (which was in the active list) back to a drr
one, that class will be added twice to the active list [1].

Doing so with the following commands:

tc qdisc add dev lo root handle 1: ets bands 2 strict 1
tc qdisc add dev lo parent 1:2 handle 20: \
    tbf rate 8bit burst 100b latency 1s
tc filter add dev lo parent 1: basic classid 1:2
ping -c1 -W0.01 -s 56 127.0.0.1
tc qdisc change dev lo root handle 1: ets bands 2 strict 2
tc qdisc change dev lo root handle 1: ets bands 2 strict 1
ping -c1 -W0.01 -s 56 127.0.0.1

Will trigger the following splat with list debug turned on:

[   59.279014][  T365] ------------[ cut here ]------------
[   59.279452][  T365] list_add double add: new=ffff88801d60e350, prev=ffff88801d60e350, next=ffff88801d60e2c0.
[   59.280153][  T365] WARNING: CPU: 3 PID: 365 at lib/list_debug.c:35 __list_add_valid_or_report+0x17f/0x220
[   59.280860][  T365] Modules linked in:
[   59.281165][  T365] CPU: 3 UID: 0 PID: 365 Comm: tc Not tainted 6.18.0-rc7-00105-g7e9f13163c13-dirty #239 PREEMPT(voluntary)
[   59.281977][  T365] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   59.282391][  T365] RIP: 0010:__list_add_valid_or_report+0x17f/0x220
[   59.282842][  T365] Code: 89 c6 e8 d4 b7 0d ff 90 0f 0b 90 90 31 c0 e9 31 ff ff ff 90 48 c7 c7 e0 a0 22 9f 48 89 f2 48 89 c1 4c 89 c6 e8 b2 b7 0d ff 90 <0f> 0b 90 90 31 c0 e9 0f ff ff ff 48 89 f7 48 89 44 24 10 4c 89 44
...
[   59.288812][  T365] Call Trace:
[   59.289056][  T365]  <TASK>
[   59.289224][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.289546][  T365]  ets_qdisc_change+0xd2b/0x1e80
[   59.289891][  T365]  ? __lock_acquire+0x7e7/0x1be0
[   59.290223][  T365]  ? __pfx_ets_qdisc_change+0x10/0x10
[   59.290546][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.290898][  T365]  ? __mutex_trylock_common+0xda/0x240
[   59.291228][  T365]  ? __pfx___mutex_trylock_common+0x10/0x10
[   59.291655][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.291993][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.292313][  T365]  ? trace_contention_end+0xc8/0x110
[   59.292656][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.293022][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.293351][  T365]  tc_modify_qdisc+0x63a/0x1cf0

Fix this by always checking and removing an ets class from the active list
when changing it to strict.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/net/sched/sch_ets.c?id=ce052b9402e461a9aded599f5b47e76bc727f7de#n663

Fixes: cd9b50a ("net/sched: ets: fix crash when flipping from 'strict' to 'quantum'")
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: Victor Nogueira <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
…t misplacements

Add a test case for a bug fixed by Jamal [1] and for scenario where an
ets drr class is inserted into the active list twice.

- Try to delete ets drr class' qdisc while still keeping it in the
  active list
- Try to add ets class to the active list twice

[1] https://lore.kernel.org/netdev/[email protected]/

Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: Victor Nogueira <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Set the error code if "transferred != sizeof(cmd)" instead of
returning success.

Fixes: dbafc28 ("NFC: pn533: don't send USB data off of the stack")
Signed-off-by: Dan Carpenter <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
I'm retiring from maintaining netfilter. I'll still keep an
eye on ipset and respond to anything related to it.

Thank you!

Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Jakub reports spurious failures of the 'conntrack_reverse_clash.sh'
selftest.  A bogus test makes nat core resort to port rewrite even
though there is no need for this.

When the test is made, nf_nat_used_tuple() would already have caused us
to return if no other CPU had added a colliding entry.
Moreover, nf_nat_used_tuple() would have ignored the colliding entry if
their origin tuples had been the same.

All that is left to check is if the colliding entry in the hash table
is subject to NAT, and, if its not, if our entry matches in the reverse
direction, e.g. hash table has

addr1:1234 -> addr2:80, and we want to commit
addr2:80   -> addr1:1234.

Because we already checked that neither the new nor the committed entry is
subject to NAT we only have to check origin vs. reply tuple:
for non-nat entries, the reply tuple is always the inverted original.

Just in case there are more problems extend the error reporting
in the selftest while at it and dump conntrack table/stats on error.

Reported-by: Jakub Kicinski <[email protected]>
Closes: https://lore.kernel.org/netdev/[email protected]/
Fixes: d8f84a9 ("netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash")
Signed-off-by: Florian Westphal <[email protected]>
…tore

This validation predates the introduction of the state machine that
determines when to enter slow path validation for error reporting.

Currently, table validation is perform when:

- new rule contains expressions that need validation.
- new set element with jump/goto verdict.

Validation on register store skips most checks with no basechains, still
this walks the graph searching for loops and ensuring expressions are
called from the right hook. Remove this.

Fixes: a654de8 ("netfilter: nf_tables: fix chain dependency validation")
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
There's a lot of unnecessary whitespace damage in this
file: space before tabs, etc., that has no formatting
or readability effect or advantages.

Fix them.

Signed-off-by: Ingo Molnar <[email protected]>
Link: https://patch.msgid.link/176535283007.498.16442167388418039352.tip-bot2@tip-bot2
lkkbd_interrupt() schedules lk->tq via schedule_work(), and the work
handler lkkbd_reinit() dereferences the lkkbd structure and its
serio/input_dev fields.

lkkbd_disconnect() and error paths in lkkbd_connect() free the lkkbd
structure without preventing the reinit work from being queued again
until serio_close() returns. This can allow the work handler to run
after the structure has been freed, leading to a potential use-after-free.

Use disable_work_sync() instead of cancel_work_sync() to ensure the
reinit work cannot be re-queued, and call it both in lkkbd_disconnect()
and in lkkbd_connect() error paths after serio_open().

Signed-off-by: Minseong Kim <[email protected]>
Cc: [email protected]
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
…rning EDEADLK

Fedora/CentOS/RHEL CI is reporting intermittent failures while running
the KUnit tests present in drm_hdmi_state_helper_test.c [1].

While the specific test causing the failure change between runs, all of
them are caused by drm_kunit_helper_enable_crtc_connector() returning
-EDEADLK. The error trace always follow this structure:

    # <test name>: ASSERTION FAILED at
    # drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c:<line>
    Expected ret == 0, but
        ret == -35 (0xffffffffffffffdd)

As documented, if the drm_kunit_helper_enable_crtc_connector() function
returns -EDEADLK (-35), the entire atomic sequence must be restarted.

Handle this error code for all function calls.

Closes: https://datawarehouse.cki-project.org/issue/4039  [1]
Fixes: 6a5c0ad ("drm/tests: hdmi_state_helpers: Switch to new helper")
Reviewed-by: Maxime Ripard <[email protected]>
Signed-off-by: José Expósito <[email protected]>
Link: https://patch.msgid.link/[email protected]
Fedora/CentOS/RHEL CI is reporting intermittent failures while running
the drm_test_check_valid_clones() KUnit test.

The error log can be either [1]:

    # drm_test_check_valid_clones: ASSERTION FAILED at
    # drivers/gpu/drm/tests/drm_atomic_state_test.c:295
    Expected ret == param->expected_result, but
        ret == -35 (0xffffffffffffffdd)
        param->expected_result == 0 (0x0)

Or [2] depending on the test case:

    # drm_test_check_valid_clones: ASSERTION FAILED at
    # drivers/gpu/drm/tests/drm_atomic_state_test.c:295
    Expected ret == param->expected_result, but
        ret == -35 (0xffffffffffffffdd)
        param->expected_result == -22 (0xffffffffffffffea)

Restart the atomic sequence when EDEADLK is returned.

[1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2113057246/test_x86_64/11802139999/artifacts/jobwatch/logs/recipes/19824965/tasks/204347800/results/946112713/logs/dmesg.log
[2] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744297/test_aarch64/11762450907/artifacts/jobwatch/logs/recipes/19797942/tasks/204139727/results/945094561/logs/dmesg.log

Fixes: 88849f2 ("drm/tests: Add test for drm_atomic_helper_check_modeset()")
Closes: https://datawarehouse.cki-project.org/issue/4004
Reviewed-by: Maxime Ripard <[email protected]>
Signed-off-by: José Expósito <[email protected]>
Link: https://patch.msgid.link/[email protected]
Fedora/CentOS/RHEL CI is reporting intermittent failures while running
the drm_validate_modeset test [1]:

    # drm_test_check_connector_changed_modeset: EXPECTATION FAILED at
    # drivers/gpu/drm/tests/drm_atomic_state_test.c:162
    Expected ret == 0, but
        ret == -35 (0xffffffffffffffdd)

Change the set_up_atomic_state() helper function to return on error and
restart the atomic sequence when the returned error is EDEADLK.

[1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744096/test_x86_64/11762450343/artifacts/jobwatch/logs/recipes/19797909/tasks/204139142/results/945095586/logs/dmesg.log

Fixes: 73d934d ("drm/tests: Add test for drm_atomic_helper_commit_modeset_disables()")
Closes: https://datawarehouse.cki-project.org/issue/4004
Reviewed-by: Maxime Ripard <[email protected]>
Signed-off-by: José Expósito <[email protected]>
Link: https://patch.msgid.link/[email protected]
Add missing drm_gem_object_put() call when drm_gem_object_lookup()
successfully returns an object. This fixes a GEM object reference
leak that can prevent driver modules from unloading when using
prime buffers.

Fixes: 5309672 ("drm: Add DRM prime interface to reassign GEM handle")
Cc: <[email protected]> # v6.18+
Signed-off-by: Karol Wachowski <[email protected]>
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Maciej Falkowski <[email protected]>
Signed-off-by: Christian König <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
frank-w and others added 29 commits December 19, 2025 09:18
…r4 (pro)

Build devicetree binaries for testing overlays and providing users
full dtb without using overlays for Bananapi R4 (pro) variants.

Signed-off-by: Frank Wunderlich <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Rob Herring (Arm) <[email protected]>
It's a requirement that DT overlays be applied at build time in order to
validate them as overlays are not validated on their own.

Add missing target for mt8395-radxa hd panel overlay.

Fixes: 4c8ff61 ("arm64: dts: mediatek: mt8395-radxa-nio-12l: Add Radxa 8 HD panel")
Signed-off-by: Frank Wunderlich <[email protected]>
Acked-by: AngeloGioacchino Del Regno <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Rob Herring (Arm) <[email protected]>
The reset_history attributes are write only. Hence don't report them as
readable just to return -EOPNOTSUPP later on.

Fixes: cbc2953 ("hwmon: Add driver for LTC4282")
Signed-off-by: Nuno Sá <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
…nux/kernel/git/axboe/linux

Pull io_uring fix from Jens Axboe:
 "Just a single fix this week, for an issue with the calculation of the
  number of segments in the ublk kbuf import path"

* tag 'io_uring-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring: fix nr_segs calculation in io_import_kbuf
…/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

 - ublk selftests for missing coverage

 - two fixes for the block integrity code

 - fix for the newly added newly added PR read keys ioctl, limiting the
   memory that can be allocated

 - work around for a deadlock that can occur with ublk, where partition
   scanning ends up recursing back into file closure, which needs the
   same mutex grabbed. Not the prettiest thing in the world, but an
   acceptable work-around until we can eliminate the reliance on
   disk->open_mutex for this

 - fix for a race between enabling writeback throttling and new IO
   submissions

 - move a bit of bio flag handling code. No changes, but needed for a
   patchset for a future kernel

 - fix for an init time id leak failure in rnbd

 - loop/zloop state check fix

* tag 'block-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  block: validate interval_exp integrity limit
  block: validate pi_offset integrity limit
  block: rnbd-clt: Fix leaked ID in init_dev()
  ublk: fix deadlock when reading partition table
  block: add allocation size check in blkdev_pr_read_keys()
  Documentation: admin-guide: blockdev: replace zone_capacity with zone_capacity_mb when creating devices
  zloop: use READ_ONCE() to read lo->lo_state in queue_rq path
  loop: use READ_ONCE() to read lo->lo_state without locking
  block: fix race between wbt_enable_default and IO submission
  selftests: ublk: add user copy test cases
  selftests: ublk: add support for user copy to kublk
  selftests: ublk: forbid multiple data copy modes
  selftests: ublk: don't share backing files between ublk servers
  selftests: ublk: use auto_zc for PER_IO_DAEMON tests in stress_04
  selftests: ublk: fix fio arguments in run_io_and_recover()
  selftests: ublk: remove unused ios map in seq_io.bt
  selftests: ublk: correct last_rw map type in seq_io.bt
  selftests: ublk: fix overflow in ublk_queue_auto_zc_fallback()
  block: move around bio flagging helpers
…ux/kernel/git/iommu/linux

Pull iommu fixes from Joerg Roedel:

 - iommupt: Fix an oops found by syzcaller in the new generic
   IO-page-table code.

 - AMD-Vi: Fix IO_PAGE_FAULTs in kdump kernels triggered by re-using
   domain-ids from previous kernel.

* tag 'iommu-fixes-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
  amd/iommu: Make protection domain ID functions non-static
  amd/iommu: Preserve domain ids inside the kdump kernel
  iommupt: Return ERR_PTR from _table_alloc()
…ernel/git/vbabka/slab

Pull slab fix from Vlastimil Babka:

 - A stable fix for a missing tag reset that can happen in
   kfree_nolock() with KASAN+SLUB_TINY configs (Deepanshu Kartikey)

* tag 'slab-for-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slub: reset KASAN tag in defer_free() before accessing freed memory
…nux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
 "Just a single patch fixing a sparse warning"

* tag 'for-linus-6.19-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: Fix sparse warning in enlighten_pv.c
Pull x86 kvm fixes from Paolo Bonzini:
 "x86 fixes.  Everyone else is already in holiday mood apparently.

   - Add a missing 'break' to fix param parsing in the rseq selftest

   - Apply runtime updates to the _current_ CPUID when userspace is
     setting CPUID, e.g. as part of vCPU hotplug, to fix a false
     positive and to avoid dropping the pending update

   - Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot, as
     it's not supported by KVM and leads to a use-after-free due to KVM
     failing to unbind the memslot from the previously-associated
     guest_memfd instance

   - Harden against similar KVM_MEM_GUEST_MEMFD goofs, and prepare for
     supporting flags-only changes on KVM_MEM_GUEST_MEMFD memlslots,
     e.g. for dirty logging

   - Set exit_code[63:32] to -1 (all 0xffs) when synthesizing a nested
     SVM_EXIT_ERR (a.k.a. VMEXIT_INVALID) #VMEXIT, as VMEXIT_INVALID is
     defined as -1ull (a 64-bit value)

   - Update SVI when activating APICv to fix a bug where a
     post-activation EOI for an in-service IRQ would effective be lost
     due to SVI being stale

   - Immediately refresh APICv controls (if necessary) on a nested
     VM-Exit instead of deferring the update via KVM_REQ_APICV_UPDATE,
     as the request is effectively ignored because KVM thinks the vCPU
     already has the correct APICv settings"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: nVMX: Immediately refresh APICv controls as needed on nested VM-Exit
  KVM: VMX: Update SVI during runtime APICv activation
  KVM: nSVM: Set exit_code_hi to -1 when synthesizing SVM_EXIT_ERR (failed VMRUN)
  KVM: nSVM: Clear exit_code_hi in VMCB when synthesizing nested VM-Exits
  KVM: Harden and prepare for modifying existing guest_memfd memslots
  KVM: Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot
  KVM: selftests: Add a CPUID testcase for KVM_SET_CPUID2 with runtime updates
  KVM: x86: Apply runtime updates to current CPUID during KVM_SET_CPUID{,2}
  KVM: selftests: Add missing "break" in rseq_test's param parsing
…git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
 "Two left-over updates that could not go into -rc1 due to conflicts
  with other series:

   - Simplify checks in arch_kfence_init_pool() since
     force_pte_mapping() already takes BBML2-noabort (break-before-make
     Level 2 with no aborts generated) into account

   - Remove unneeded SVE/SME fallback preserve/store handling in the
     arm64 EFI. With the recent updates, the fallback path is only taken
     for EFI runtime calls from hardirq or NMI contexts. In practice,
     this only happens under panic/oops/emergency_restart() and no
     restoring of the user state expected.

     There's a corresponding lkdtm update to trigger a BUG() or panic()
     from hardirq context together with a fixup not to confuse
     clang/objtool about the control flow

  GCS (guarded control stacks) fix: flush the GCS locking state on exec,
  otherwise the new task will not be able to enable GCS (locked as
  disabled)"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  lkdtm/bugs: Do not confuse the clang/objtool with busy wait loop
  arm64/gcs: Flush the GCS locking state on exec
  arm64/efi: Remove unneeded SVE/SME fallback preserve/store handling
  lkdtm/bugs: Add cases for BUG and PANIC occurring in hardirq context
  arm64: mm: Simplify check in arch_kfence_init_pool()
…ernel/git/mips/linux

Pull MIPS fixes from Thomas Bogendoerfer:

 - Fix build error for Alchemy

 - Fix reference leak

* tag 'mips-fixes_6.19_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: Fix a reference leak bug in ip22_check_gio()
  MIPS: Alchemy: Remove bogus static/inline specifiers
…cm/linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:

 - Fix warnings for Mediatek overlays not getting applied

 - Fix regression in handling elfcorehdr region

 - Fix creating cpufreq device on OPPv1 platforms

 - Add GE7800 GPU in Renesas R-Car V3U

 - Simplify dma-coherent property in TI display bindings

 - Allow "reg" in sprd,sc9860-clk binding

 - Update Linus Walleij's email

* tag 'devicetree-fixes-for-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  arm64: dts: mediatek: Apply mt8395-radxa DT overlay at build time
  arm64: dts: mediatek: mt7988: add dtbs with applied overlays for bpi-r4 (pro)
  arm64: dts: mediatek: mt7986: add dtbs with applied overlays for bpi-r3
  dt-bindings: Updates Linus Walleij's mail address
  dt-bindings: gpu: img,powervr-rogue: Document GE7800 GPU in Renesas R-Car V3U
  cpufreq: dt-platdev: Fix creating device on OPPv1 platforms
  dt-bindings: clock: sprd,sc9860-clk: Allow "reg" for gate clocks
  dt-bindings: display/ti: Simplify dma-coherent property
  arm64: kdump: Fix elfcorehdr overlap caused by reserved memory processing reorder
…org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit fixes from Shuah Khan:
 "Drop unused parameter from kunit_device_register_internal and make
  FAULT_TEST default to n when PANIC_ON_OOPS"

* tag 'linux_kselftest-kunit-fixes-6.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: make FAULT_TEST default to n when PANIC_ON_OOPS
  kunit: Drop unused parameter from kunit_device_register_internal
…m/kernel

Pull drm fixes from Dave Airlie:
 "rc2 fixes for the week, mostly xe, with amdgpu as usual. Then a
  smattering of small fixes across the core/tests/panel and amdxdna.

  I expect things will be quiet for rc3/4 as teams take a break, and I'm
  travelling but will keep an eye on things.

  core:
   - fix gem handle leak on DRM_IOCTL_GEM_CHANGE_HANDLE

  tests:
   - add EDEADLK handling

  amdgpu:
   - Fix no_console_suspend handling
   - DCN 3.5.x seamless boot fixes
   - DP audio fix
   - Fix race in GPU recovery
   - SMU 14 OD fix

  amdkfd:
   - Event fix

  xe:
   - Limit num_syncs to prevent oversized kernel allocations
   - Disallow 0 OA property values
   - Disallow 0 EU stall property values
   - Fix kobject leak
   - Workaround
   - Loop variable reference fix
   - Fix a CONFIG corner-case incorrect number of argument
   - Skip reason prefix while emitting array
   - VF migration fix
   - Fix context in mei interrupt top half
   - Don't include the CCS metadata in the dma-buf sg-table
   - VF queueing recovery work fix
   - Increase TDF timeout
   - GT reset registers vs scheduler ordering fix
   - Adjust long-running workload timeslices
   - Always set OA_OAGLBCTXCTRL_COUNTER_RESUME
   - Fix a return value
   - Drop preempt-fences when destroying imported dma-bufs
   - Use usleep_range for accurate long-running workload timeslicing

  amdxdna:
   - don't load virtualized

  panel:
   - fix visionox-rm69299 Kconfig dependency
   - sony-td4353-jdi probing fix"

* tag 'drm-fixes-2025-12-20' of https://gitlab.freedesktop.org/drm/kernel: (34 commits)
  drm/xe: Use usleep_range for accurate long-running workload timeslicing
  drm/xe: Drop preempt-fences when destroying imported dma-bufs.
  drm/xe/eustall: Disallow 0 EU stall property values
  drm/xe/oa: Disallow 0 OA property values
  drm/xe/xe_sriov_vfio: Fix return value in xe_sriov_vfio_migration_supported()
  drm/xe/oa: Always set OAG_OAGLBCTXCTRL_COUNTER_RESUME
  drm/xe: Adjust long-running workload timeslices to reasonable values
  drm/xe/oa: Limit num_syncs to prevent oversized allocations
  drm/xe: Limit num_syncs to prevent oversized allocations
  drm/amdkfd: Fix improper NULL termination of queue restore SMI event string
  drm/amd/pm: restore SCLK settings after S0ix resume
  drm/amdgpu: fix a job->pasid access race in gpu recovery
  drm/amd/display: Fix DP no audio issue
  drm/amd/display: Fix scratch registers offsets for DCN351
  drm/amd/display: Fix scratch registers offsets for DCN35
  drm/amd: Resume the device in thaw() callback when console suspend is disabled
  drm/panel: visionox-rm69299: Depend on BACKLIGHT_CLASS_DEVICE
  accel/amdxdna: Block running under a hypervisor
  drm/panel: sony-td4353-jdi: Enable prepare_prev_first
  drm/xe: Restore engine registers before restarting schedulers after GT reset
  ...
…l/git/ulfh/mmc

Pull MMC host fixes from Ulf Hansson:

 - sdhci-esdhc-imx: Fix build problem dependency

 - sdhci-of-arasan: Increase card-detect stable timeout to 2 seconds

 - sdhci-of-aspeed: Fix DT doc for missing properties

* tag 'mmc-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-esdhc-imx: add alternate ARCH_S32 dependency to Kconfig
  mmc: sdhci-of-arasan: Increase CD stable timeout to 2 seconds
  dt-bindings: mmc: sdhci-of-aspeed: Switch ref to sdhci-common.yaml
…/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

 - ltc4282: Fix reset_history file permissions

 - ds620: Update broken Datasheet URL in driver documentation

 - tmp401: Fix overflow caused by default conversion rate value

 - ibmpex: Fix use-after-free in high/low store

 - dell-smm: Limit fan multiplier to avoid overflow

* tag 'hwmon-for-v6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (ltc4282): Fix reset_history file permissions
  hwmon: (DS620) Update broken Datasheet URL in driver documentation
  hwmon: (tmp401) fix overflow caused by default conversion rate value
  hwmon: (ibmpex) fix use-after-free in high/low store
  hwmon: (dell-smm) Limit fan multiplier to avoid overflow
…/xfs-linux

Pull xfs fixes from Carlos Maiolino:
 "This contains a few fixes for zoned devices support, an UAF and a
  compiler warning, and some cleaning up"

* tag 'xfs-fixes-6.19-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix the zoned RT growfs check for zone alignment
  xfs: validate that zoned RT devices are zone aligned
  xfs: fix XFS_ERRTAG_FORCE_ZERO_RANGE for zoned file system
  xfs: fix a memory leak in xfs_buf_item_init()
  xfs: fix stupid compiler warning
  xfs: fix a UAF problem in xattr repair
  xfs: ignore discard return value
Work around clang problems with "=rm" asm constraint.

clang seems to always chose the memory output, while it is almost
always the worst choice.

Add ASM_OUTPUT_RM so that we can replace "=rm" constraint
where it matters for clang, while not penalizing gcc.

Signed-off-by: Eric Dumazet <[email protected]>
Suggested-by: Uros Bizjak <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
clang is generating very inefficient code for native_save_fl() which is
used for local_irq_save() in critical spots.

Allowing the "pop %0" to use memory:

 1) forces the compiler to add annoying stack canaries when
    CONFIG_STACKPROTECTOR_STRONG=y in many places.

 2) Almost always is followed by an immediate "move memory,register"

One good example is _raw_spin_lock_irqsave, with 8 extra instructions

  ffffffff82067a30 <_raw_spin_lock_irqsave>:
  ffffffff82067a30:		...
  ffffffff82067a39:		53						push   %rbx

  // Three instructions to ajust the stack, read the per-cpu canary
  // and copy it to 8(%rsp)
  ffffffff82067a3a:		48 83 ec 10 			sub    $0x10,%rsp
  ffffffff82067a3e:		65 48 8b 05 da 15 45 02 mov    %gs:0x24515da(%rip),%rax 	   # <__stack_chk_guard>
  ffffffff82067a46:		48 89 44 24 08			mov    %rax,0x8(%rsp)

  ffffffff82067a4b:		9c						pushf

  // instead of pop %rbx, compiler uses 2 instructions.
  ffffffff82067a4c:		8f 04 24				pop    (%rsp)
  ffffffff82067a4f:		48 8b 1c 24 			mov    (%rsp),%rbx

  ffffffff82067a53:		fa						cli
  ffffffff82067a54:		b9 01 00 00 00			mov    $0x1,%ecx
  ffffffff82067a59:		31 c0					xor    %eax,%eax
  ffffffff82067a5b:		f0 0f b1 0f 			lock cmpxchg %ecx,(%rdi)
  ffffffff82067a5f:		75 1d					jne    ffffffff82067a7e <_raw_spin_lock_irqsave+0x4e>

  // three instructions to check the stack canary
  ffffffff82067a61:		65 48 8b 05 b7 15 45 02 mov    %gs:0x24515b7(%rip),%rax 	   # <__stack_chk_guard>
  ffffffff82067a69:		48 3b 44 24 08			cmp    0x8(%rsp),%rax
  ffffffff82067a6e:		75 17					jne    ffffffff82067a87

  ...

  // One extra instruction to adjust the stack.
  ffffffff82067a73:		48 83 c4 10 			add    $0x10,%rsp
  ...

  // One more instruction in case the stack was mangled.
  ffffffff82067a87:		e8 a4 35 ff ff			call   ffffffff8205b030 <__stack_chk_fail>

This patch changes nothing for gcc, but for clang saves ~20000 bytes of text
even though more functions are inlined.

  $ size vmlinux.gcc.before vmlinux.gcc.after vmlinux.clang.before vmlinux.clang.after
     text	   data		bss		dec		hex	filename
  45565821	25005462	4704800	75276083	47c9f33	vmlinux.gcc.before
  45565821	25005462	4704800	75276083	47c9f33	vmlinux.gcc.after
  45121072	24638617	5533040	75292729	47ce039	vmlinux.clang.before
  45093887	24638633	5536808	75269328	47c84d0	vmlinux.clang.after

  $ scripts/bloat-o-meter -t vmlinux.clang.before vmlinux.clang.after
  add/remove: 1/2 grow/shrink: 21/533 up/down: 2250/-22112 (-19862)

Signed-off-by: Eric Dumazet <[email protected]>
Cc: Uros Bizjak <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
…ernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A small collection of fixes for various SPI drivers, plus a relaxation
  of constraints in the DT for the DesignWare controller to reflect
  hardware that's been seen.

  There's several fixes for the Cadence QuadSPI driver since a fix
  during the last release made some existing issues with error handling
  during probe more readily visible"

* tag 'spi-fix-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: mt65xx: Use IRQF_ONESHOT with threaded IRQ
  spi: dt-bindings: snps,dw-abp-ssi: Allow up to 16 chip-selects
  spi: cadence-quadspi: Fix clock disable on probe failure path
  spi: cadence-quadspi: Add error logging for DMA request failure
  spi: fsl-cpm: Check length parity before switching to 16 bit mode
  spi: mpfs: Fix an error handling path in mpfs_spi_probe()
…/git/libata/linux

Pull ata fix from Damien Le Moal:

 - Disable link power management (LPM) for a Seagate drive that is
   misbehaving when LPM is enabled

* tag 'ata-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: libata-core: Disable LPM on ST2000DM008-2FR102
Enhance the coccicheck script to filter *.cocci files based on the
specified MODE (e.g., report, patch). This ensures that only compatible
semantic patch files are executed, preventing errors such as:

    "virtual rule report not supported"

This error occurs when a .cocci file does not define a 'virtual <MODE>'
rule, yet is executed in that mode.

For example:

    make coccicheck M=drivers/hwtracing/coresight/ MODE=report

In this case, running "secs_to_jiffies.cocci" would trigger the error
because it lacks support for 'report' mode. With this change, such files
are skipped automatically, improving robustness and developer
experience.

Signed-off-by: Songwei Chai <[email protected]>
Reviewed-by: Julia Lawall <[email protected]>
s/Unecessary/Unnecessary/

Reviewed-by: Julia Lawall <[email protected]>
Signed-off-by: Thorsten Blum <[email protected]>
…ux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
 "Fix IRQ thread affinity flags setup regression"

* tag 'irq-urgent-2025-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Don't overwrite interrupt thread flags on setup
…ux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

 - Fix FPU core dumps on certain CPU models

 - Fix htmldocs build warning

 - Export TLB tracing event name via header

 - Remove unused constant from <linux/mm_types.h>

 - Fix comments

 - Fix whitespace noise in documentation

 - Fix variadic structure's definition to un-confuse UBSAN

 - Fix posted MSI interrupts irq_retrigger() bug

 - Fix asm build failure with older GCC builds

* tag 'x86-urgent-2025-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/bug: Fix old GCC compile fails
  x86/msi: Make irq_retrigger() functional for posted MSI
  x86/platform/uv: Fix UBSAN array-index-out-of-bounds
  mm: Remove tlb_flush_reason::NR_TLB_FLUSH_REASONS from <linux/mm_types.h>
  x86/mm/tlb/trace: Export the TLB_REMOTE_WRONG_CPU enum in <trace/events/tlb.h>
  x86/sgx: Remove unmatched quote in __sgx_encl_extend function comment
  x86/boot/Documentation: Fix whitespace noise in boot.rst
  x86/fpu: Fix FPU state core dump truncation on CPUs with no extended xfeatures
  x86/boot/Documentation: Fix htmldocs build warning due to malformed table in boot.rst
…rnel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

 - bcm, pxa, rcar: fix void-pointer-to-enum-cast warning

 - new hardware IDs / DT bindings for
    - Intel Nova Lake-S
    - Mobileye
    - Qualcomm SM8750

* tag 'i2c-for-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  dt-bindings: i2c: qcom-cci: Document SM8750 compatible
  i2c: i801: Add support for Intel Nova Lake-S
  dt-bindings: i2c: dw: Add Mobileye I2C controllers
  i2c: rcar: Fix Wvoid-pointer-to-enum-cast warning
  i2c: pxa: Fix Wvoid-pointer-to-enum-cast warning
  i2c: bcm-iproc: Fix Wvoid-pointer-to-enum-cast warning
…/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:

 - a quirk for i8042 to better handle another TUXEDO model

 - a quirk to atkbd to handle incorcet behavior of HONOR FMB-P internal
   keyboard

 - a definition for a new ABS_SND_PROFILE event

 - fixes to alps and lkkbd drivers to reliably shut down pending work on
   removal

 - a fix to apple_z2 driver tightening input report parsing

 - a fix for "off-by-one" error when validating config in ti_am335x_tsc
   driver

 - addition of CRKD Guitars device IDs to xpad driver.

* tag 'input-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: ti_am335x_tsc - fix off-by-one error in wire_order validation
  Input: xpad - add support for CRKD Guitars
  Input: add ABS_SND_PROFILE
  Input: apple_z2 - fix reading incorrect reports after exiting sleep
  Input: alps - fix use-after-free bugs caused by dev3_register_work
  Input: i8042 - add TUXEDO InfinityBook Max Gen10 AMD to i8042 quirk table
  Input: atkbd - skip deactivate for HONOR FMB-P's internal keyboard
  Input: lkkbd - disable pending work before freeing device
…nel/git/jlawall/linux

Pull Coccinelle fixes from Julia Lawall:
 "These fix a typo and make the coccicheck script more robust by
  ensuring that only compatible semantic patches are executed for the
  chosen mode"

* tag 'coccinelle-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  Coccinelle: pm_runtime: Fix typo in report message
  scripts: coccicheck: filter *.cocci files by MODE
@sunflower2333 sunflower2333 merged commit 177e54b into sunflower2333:master Dec 22, 2025
1 check failed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.