Fix fsync absence when `fsync=N` parameter specified #1799

RomanSofyin · 2024-08-06T15:54:00Z

When fsync=N parameter specified and the final IO block issued happens to be Nth the fsync should follow it is missed. This update checks if it needs to loop again in do_io to issue that final fsync when all the IO work is done.

Signed-off-by: Roman Sofin [email protected]

backend.c

fio.h

io_u.c

RomanSofyin

Addressed the issues.

fio.h

axboe · 2024-08-07T18:15:31Z

Addressed the issues.

Please squash the two commits into one, we should not have fix-up commits in the history. Once you do, I'll take a look again.

RomanSofyin · 2024-08-12T05:55:41Z

Hi @axboe, can you look at this again?

axboe · 2024-08-12T17:24:17Z

You still have the should_fsync() change in there. If you want to have it take a const td, then please do a prep patch with just that. Like I said initially, it's not related to your fix at all outside of you adding a new caller that needs it done. That's by definition a prep patch, should not be included with a fix.

RomanSofyin · 2024-08-13T09:07:07Z

I created PR #1805 which I guess we merge first, then I update the current one in case there will be a conflict.

axboe · 2024-08-13T10:53:51Z

Since the change depends on the other one, please just have both commits in a single pr.

RomanSofyin · 2024-08-13T13:36:22Z

Should be fine now. If so, I'll close #1805.

RomanSofyin · 2024-08-16T15:19:47Z

@axboe does this look fine?

sitsofe

@RomanSofyin Looks good! Some additional comments:

Can you start your commit titles with a section? So for the first commit

io_u: make should_fsync() take a const

Make should_fsync() safer by using a pointer to a constant thread_data
structure instance.

<your signed off by>

and for the second commit just a bit of minor rewording:

io_u/backend: fix missing final fsync() when using fsync=N

When fsync=N is specified and the final I/O issued happens to be the Nth, the 
following fsync() is missed. 

This update checks if it needs to loop again in do_io() to issue that final
fsync when all the IO work is done.

<your signed off by>

io_u.c

Make should_fsync() safer by using a pointer to a constant thread_data structure instance. Signed-off-by: Roman Sofin [email protected]

sitsofe

@vincentkfu, @axboe This LGTM.

RomanSofyin · 2025-08-25T09:27:24Z

I see build-containers (oracle, oraclelinux, 9, x86_64) has failed, is this smth expected?

sitsofe · 2025-08-25T14:54:56Z

@RomanSofyin I kicked off a rebuild of that job and it passed so it seems like it was a transient error.

axboe · 2025-08-25T15:00:50Z

io_u.h

 int do_io_u_sync(const struct thread_data *, struct io_u *);
 int do_io_u_trim(struct thread_data *, struct io_u *);

+bool ddir_is_sync(const struct thread_data *);


I think I'd personally make that static inline, and require that get_sync_ddir() is called only if should_fsync() is true already. That skips a function call in the hot issue path loop for each IO, even if fsync hasn't been set at all.

As you suggested I could call get_sync_ddir() only if should_fsync() is true:

bool ddir_is_sync(const struct thread_data *td) { return should_fsync(td) && get_sync_ddir(td) != DDIR_INVAL; }

But I don't quite get how you want me to make ddir_is_sync static. Since ddir_is_sync gets called from backend.c it needs to be either in backend.c or in io_u.h to have it static. But ddir_is_sync calls should_fsync and get_sync_ddir which are static as well so can't be referred from anywhere but io_u.c.

Can you please clarify this?

@RomanSofyin if you make ddir_is_sync static inline the whole definition can live in the io_u.h header (rather than being in io_u.c) and the whole thing will be copied inline into every place it used (see dprint_io_u() in io_u.h as an example).

Scratch that - your point about about not being able to see should_fsync and get_sync_ddir from the io_u.h header stands.

@RomanSofyin I've chatted to @axboe and it should be sufficient to guard the call sites to ddir_is_sync() by a should_fsync() pre-check.

I was also thinking if it should be a bit nicer condition being wrapped in a macro or inline function. I mean both components together: (should_fsync(td) && ddir_is_sync(td)).

sitsofe · 2025-09-05T07:01:49Z

backend.c

 		    !td->o.read_iolog_file &&
-		    (!td->o.time_based ||
-		     (td->o.time_based && td->o.verify != VERIFY_NONE)))
+		    ((!td->o.time_based && !ddir_is_sync(td)) ||


If we're a time based job won't we have already exited due to the runtime_exceeded check further up? Further, wouldn't we want to avoid checking that we need to add a trailing fsync in the case that we are a non-time based job but still had a non-zero runtime limit (but again wouldn't we have already exited by now)?

runtime_exceeded condition hits if timeout (aka runtime) limit of a job is exceeded, no matter if it's a time based job or not. So if a job gets this line of code, it's allowed to do all its stuff.

As I see this, we can't avoid checking that because for a non-time based (i.e. "size based") job we terminate the loop when the number of bytes requested is exceeded even if there is an fsync call to issue left.

sitsofe · 2025-09-05T07:06:56Z

io_u.h

 int do_io_u_sync(const struct thread_data *, struct io_u *);
 int do_io_u_trim(struct thread_data *, struct io_u *);

+bool ddir_is_sync(const struct thread_data *);


@RomanSofyin I've chatted to @axboe and it should be sufficient to guard the call sites to ddir_is_sync() by a should_fsync() pre-check.

sitsofe · 2025-09-05T07:08:42Z

backend.c


 	while ((td->o.read_iolog_file && !flist_empty(&td->io_log_list)) ||
 		(!flist_empty(&td->trim_list)) || !io_issue_bytes_exceeded(td) ||
+		ddir_is_sync(td) ||


Perhaps move this to be the last case as it's hopefully less common.

When fsync=N is specified and the final I/O issued happens to be the Nth, the following fsync() is missed. This update checks if it needs to loop again in do_io() to issue that final fsync when all the IO work is done. Signed-off-by: Roman Sofin [email protected]

RomanSofyin · 2025-09-11T07:48:36Z

@sitsofe a CI job seems to hang again, can you rerun it please? link

Also would you mind looking at the changes I delivered in 7b7f18c and tell if it's sufficient?

sitsofe · 2025-09-11T19:45:27Z

@RomanSofyin

Also would you mind looking at the changes I delivered in 7b7f18c and tell if it's sufficient?

I'm currently on holiday but I'll take a proper look when they're over.