fuse: Fix EIO for writeback when DLM returns -EAGAIN#148

Closed
hbirth wants to merge 3 commits into DDNStorage:redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 from hbirth:redfs-ubuntu-hwe-6.17.0-16.16-24.04.1

@hbirth hbirth commented Apr 29, 2026

These are some small improvements that are only valid for kernel 6.17

  • don't acquire a dlm lock unless the writeback cache is actually used
  • fix fuse_iomap_read_folio_range() to keep the iomap write op contract

@hbirth hbirth requested review from bsbernd, cding-ddn and yongzech May 1, 2026 09:14
@hbirth hbirth force-pushed the redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 branch from 5a51cf5 to d1b1404 Compare May 1, 2026 11:36
Comment thread fs/fuse/file.c
Comment thread fs/fuse/file.c Outdated
Fix a logical error where the DLM lock was acquired regardless
of whether the writeback part was actually called.
This is necessary after the move to iomap_file_buffered_write().

Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
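
The idea in this commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code from the patch: `struct model_conn`, `model_dlm_lock()`, `model_dlm_unlock()` and `model_write()` are hypothetical names, and the counters exist only so the behaviour can be observed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-connection state; not the kernel's fuse_conn. */
struct model_conn {
    bool writeback_cache;   /* is the writeback cache in use? */
    int dlm_locks_taken;    /* how often the lock was acquired */
    int dlm_locks_held;     /* currently held locks, should end at 0 */
};

/* Hypothetical lock helpers that only count acquisitions and releases. */
static void model_dlm_lock(struct model_conn *c)
{
    c->dlm_locks_taken++;
    c->dlm_locks_held++;
}

static void model_dlm_unlock(struct model_conn *c)
{
    c->dlm_locks_held--;
}

/* Model of a cached write: the lock is taken only on the writeback path. */
static long model_write(struct model_conn *c, long len)
{
    if (!c->writeback_cache) {
        /* Writeback path not taken: skip the DLM lock entirely. */
        return len;
    }

    model_dlm_lock(c);
    /* The buffered (writeback) write would happen here. */
    model_dlm_unlock(c);
    return len;
}
```

With `writeback_cache` unset, `model_write()` returns without touching the lock counters; with it set, exactly one lock is taken and released per call.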
@hbirth hbirth force-pushed the redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 branch from d1b1404 to c94b214 Compare May 1, 2026 12:17
@hbirth hbirth changed the title from "fuse: don't enable large folios by default" to "fuse: small fixes for Linux 6.17" May 1, 2026
@hbirth hbirth force-pushed the redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 branch from c94b214 to a240f5c Compare May 1, 2026 13:49
Comment thread fs/fuse/file.c Outdated
Comment thread fs/fuse/file.c Outdated
@hbirth hbirth force-pushed the redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 branch 2 times, most recently from 1ab46be to f1485f6 Compare May 1, 2026 15:12

@cding-ddn cding-ddn left a comment

fs/fuse/file.c.rej should not have been added to the commit.

The comment looks incorrect to me.
We should not return EAGAIN to user space, but when the write returns, it releases the folio lock, so why would this be an ABBA deadlock?

@hbirth hbirth force-pushed the redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 branch from f1485f6 to 449c9a4 Compare May 1, 2026 15:23
hbirth and others added 2 commits May 1, 2026 17:25
fuse_do_readfolio() is called by fuse_iomap_read_folio_range()
as well and is not supposed to return a positive value, thus
the translation has to be done in another layer.

Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
When the FUSE server returns -EAGAIN during iomap write operations due to
DLM lock contention, a deadlock can occur:

1. iomap write path holds folio lock and calls fuse_iomap_read_folio_range()
2. FUSE gets -EAGAIN from server (DLM lock conflict with page invalidation)
3. fuse_do_readfolio() converts -EAGAIN to AOP_TRUNCATED_PAGE
4. iomap doesn't understand AOP_TRUNCATED_PAGE and fails the write
5. Meanwhile page invalidation holds DLM lock and waits for folio lock
6. Result: ABBA deadlock

This is a FUSE-only workaround until mainline iomap gains
AOP_TRUNCATED_PAGE retry support. The solution:

1. Add per-task retry tracking in fuse_conn using xarray
2. When fuse_iomap_read_folio_range() sees AOP_TRUNCATED_PAGE:
   - Create retry entry for current task
   - Convert to -EAGAIN for iomap
3. In fuse_cache_write_iter(), after iomap returns error:
   - Check if retry entry exists for current task
   - If yes, erase it and retry the entire write operation

This breaks the deadlock by allowing the folio lock to be released
(per AOP_TRUNCATED_PAGE contract) and retrying at a higher level.

Signed-off-by: Bernd Schubert <bernd@bsbernd.com>
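
The three solution steps in this commit message can be modelled in userspace as follows. This is a hedged sketch, not the patch itself: every `model_*` function, `struct retry_table`, `struct fake_server` and `MAX_TASKS` are hypothetical, a fixed-size array stands in for the xarray in fuse_conn, and the retry loop is unbounded only to keep the example short. The value of `AOP_TRUNCATED_PAGE` is copied from the kernel's `enum positive_aop_returns`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define AOP_TRUNCATED_PAGE 0x80001  /* positive "retry" code, as in the kernel */
#define MAX_TASKS 8                 /* hypothetical size of the toy retry table */

/* Toy replacement for the per-task retry tracking (an xarray in the commit). */
struct retry_table {
    bool pending[MAX_TASKS];
};

/* Hypothetical server model: the first N reads hit DLM lock contention. */
struct fake_server {
    int contended_reads_left;
    int reads_seen;
};

/* Low-level read: contention surfaces as AOP_TRUNCATED_PAGE. */
static int model_read_folio(struct fake_server *srv)
{
    srv->reads_seen++;
    if (srv->contended_reads_left > 0) {
        srv->contended_reads_left--;
        return AOP_TRUNCATED_PAGE;
    }
    return 0;
}

/* Step 2: record a retry entry for the task and hand iomap a plain -EAGAIN. */
static int model_iomap_read_folio_range(struct retry_table *rt, int task,
                                        struct fake_server *srv)
{
    int err = model_read_folio(srv);

    if (err == AOP_TRUNCATED_PAGE) {
        assert(task >= 0 && task < MAX_TASKS);
        rt->pending[task] = true;
        return -EAGAIN;
    }
    return err;
}

/* Stand-in for the iomap buffered write: any read error fails the write. */
static long model_iomap_buffered_write(struct retry_table *rt, int task,
                                       struct fake_server *srv, long len)
{
    int err = model_iomap_read_folio_range(rt, task, srv);

    return err ? err : len;
}

/* Step 3: on error, retry the whole write if a retry entry exists. */
static long model_cache_write_iter(struct retry_table *rt, int task,
                                   struct fake_server *srv, long len)
{
    for (;;) {
        long ret = model_iomap_buffered_write(rt, task, srv, len);

        if (ret >= 0 || !rt->pending[task])
            return ret;             /* success, or an error without a retry entry */
        rt->pending[task] = false;  /* erase the entry, then retry */
    }
}
```

In this model the write only fails towards the caller when no retry entry was recorded for the task, so -EAGAIN from lock contention never reaches user space.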
@hbirth hbirth force-pushed the redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 branch from 449c9a4 to 3ffe5e6 Compare May 1, 2026 15:27

hbirth commented May 1, 2026

fs/fuse/file.c.rej should not have been added to the commit.

The comment looks incorrect to me. We should not return EAGAIN to user space, but when the write returns, it releases the folio lock, so why would this be an ABBA deadlock?

yes I noticed ;-)
fixed now

@bsbernd bsbernd changed the title from "fuse: small fixes for Linux 6.17" to "fuse: Fix EIO for writeback when DLM returns -EAGAIN" May 1, 2026
@bsbernd bsbernd changed the base branch from redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 to fix-redfs-6.17-iomap-write-dlm-eagain-eio May 1, 2026 20:39
@bsbernd bsbernd changed the base branch from fix-redfs-6.17-iomap-write-dlm-eagain-eio to redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 May 1, 2026 20:39
@bsbernd bsbernd closed this May 1, 2026