Skip to content

Commit 663dc86

Browse files
ixhamzabehlendorf
authored andcommitted
Fix taskq NULL pointer dereference on timer race
Remove unsafe timer_pending() check in taskq_cancel_id() that created a race where: - Timer expires and timer_pending() returns FALSE - task_done() frees task with tqent_func = NULL - Timer callback executes and queues freed task - Worker thread crashes executing NULL function Always call timer_delete_sync() unconditionally to ensure timer callback completes before task is freed. Reliably reproducible by injecting mdelay(10) after setting CANCEL flag to widen the race window, combined with frequent task cancellations (e.g., snapshot automount expiry). Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #17942
1 parent 145c606 commit 663dc86

File tree

1 file changed

+24
-7
lines changed

1 file changed

+24
-7
lines changed

module/os/linux/spl/spl-taskq.c

Lines changed: 24 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -635,14 +635,31 @@ taskq_cancel_id(taskq_t *tq, taskqid_t id)
635635

636636
/*
637637
* The task_expire() function takes the tq->tq_lock so drop
638-
* drop the lock before synchronously cancelling the timer.
638+
* the lock before synchronously cancelling the timer.
639+
*
640+
* Always call timer_delete_sync() unconditionally. A
641+
* timer_pending() check would be insufficient and unsafe.
642+
* When a timer expires, it is immediately dequeued from the
643+
* timer wheel (timer_pending() returns FALSE), but the
644+
* callback (task_expire) may not run until later.
645+
*
646+
* The race window:
647+
* 1) Timer expires and is dequeued - timer_pending() now
648+
* returns FALSE
649+
* 2) task_done() is called below, freeing the task, sets
650+
* tqent_func = NULL and clears flags including CANCEL
651+
* 3) Timer callback finally runs, sees no CANCEL flag,
652+
* queues task to prio_list
653+
* 4) Worker thread attempts to execute NULL tqent_func
654+
* and panics
655+
*
656+
* timer_delete_sync() prevents this by ensuring the timer
657+
* callback completes before the task is freed.
639658
*/
640-
if (timer_pending(&t->tqent_timer)) {
641-
spin_unlock_irqrestore(&tq->tq_lock, flags);
642-
timer_delete_sync(&t->tqent_timer);
643-
spin_lock_irqsave_nested(&tq->tq_lock, flags,
644-
tq->tq_lock_class);
645-
}
659+
spin_unlock_irqrestore(&tq->tq_lock, flags);
660+
timer_delete_sync(&t->tqent_timer);
661+
spin_lock_irqsave_nested(&tq->tq_lock, flags,
662+
tq->tq_lock_class);
646663

647664
if (!(t->tqent_flags & TQENT_FLAG_PREALLOC))
648665
task_done(tq, t);

0 commit comments

Comments
 (0)