-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Fix fsync absence when fsync=N parameter specified
#1799
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
RomanSofyin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed the issues.
Please squash the two commits into one, we should not have fix-up commits in the history. Once you do, I'll take a look again. |
b1c0bb5 to
2b11473
Compare
|
Hi @axboe, can you look at this again? |
|
You still have the should_fsync() change in there. If you want to have it take a const td, then please do a prep patch with just that. Like I said initially, it's not related to your fix at all outside of you adding a new caller that needs it done. That's by definition a prep patch, should not be included with a fix. |
|
I created PR #1805 which I guess we merge first, then I update the current one in case there will be a conflict. |
|
Since the change depends on the other one, please just have both commits in a single pr. |
2b11473 to
d3396c6
Compare
|
Should be fine now. If so, I'll close #1805. |
|
@axboe does this look fine? |
sitsofe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RomanSofyin Looks good! Some additional comments:
Can you start your commit titles with a section? So for the first commit
io_u: make should_fsync() take a const
Make should_fsync() safer by using a pointer to a constant thread_data
structure instance.
<your signed off by>
and for the second commit just a bit of minor rewording:
io_u/backend: fix missing final fsync() when using fsync=N
When fsync=N is specified and the final I/O issued happens to be the Nth, the
following fsync() is missed.
This update checks if it needs to loop again in do_io() to issue that final
fsync when all the IO work is done.
<your signed off by>
Make should_fsync() safer by using a pointer to a constant thread_data structure instance. Signed-off-by: Roman Sofin [email protected]
d3396c6 to
6078913
Compare
sitsofe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vincentkfu, @axboe This LGTM.
|
I see build-containers (oracle, oraclelinux, 9, x86_64) has failed, is this smth expected? |
|
@RomanSofyin I kicked off a rebuild of that job and it passed so it seems like it was a transient error. |
| int do_io_u_sync(const struct thread_data *, struct io_u *); | ||
| int do_io_u_trim(struct thread_data *, struct io_u *); | ||
|
|
||
| bool ddir_is_sync(const struct thread_data *); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd personally make that static inline, and require that get_sync_ddir() is called only if should_fsync() is true already. That skips a function call in the hot issue path loop for each IO, even if fsync hasn't been set at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you suggested I could call get_sync_ddir() only if should_fsync() is true:
bool ddir_is_sync(const struct thread_data *td)
{
return should_fsync(td) && get_sync_ddir(td) != DDIR_INVAL;
}But I don't quite get how you want me to make ddir_is_sync static. Since ddir_is_sync gets called from backend.c it needs to be either in backend.c or in io_u.h to have it static. But ddir_is_sync calls should_fsync and get_sync_ddir which are static as well so can't be referred from anywhere but io_u.c.
Can you please clarify this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RomanSofyin if you make ddir_is_sync static inline the whole definition can live in the io_u.h header (rather than being in io_u.c) and the whole thing will be copied inline into every place it used (see dprint_io_u() in io_u.h as an example).
Scratch that - your point about about not being able to see should_fsync and get_sync_ddir from the io_u.h header stands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RomanSofyin I've chatted to @axboe and it should be sufficient to guard the call sites to ddir_is_sync() by a should_fsync() pre-check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking if it should be a bit nicer condition being wrapped in a macro or inline function. I mean both components together: (should_fsync(td) && ddir_is_sync(td)).
backend.c
Outdated
| !td->o.read_iolog_file && | ||
| (!td->o.time_based || | ||
| (td->o.time_based && td->o.verify != VERIFY_NONE))) | ||
| ((!td->o.time_based && !ddir_is_sync(td)) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're a time based job won't we have already exited due to the runtime_exceeded check further up? Further, wouldn't we want to avoid checking that we need to add a trailing fsync in the case that we are a non-time based job but still had a non-zero runtime limit (but again wouldn't we have already exited by now)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
runtime_exceeded condition hits if timeout (aka runtime) limit of a job is exceeded, no matter if it's a time based job or not. So if a job gets this line of code, it's allowed to do all its stuff.
As I see this, we can't avoid checking that because for a non-time based (i.e. "size based") job we terminate the loop when the number of bytes requested is exceeded even if there is an fsync call to issue left.
| int do_io_u_sync(const struct thread_data *, struct io_u *); | ||
| int do_io_u_trim(struct thread_data *, struct io_u *); | ||
|
|
||
| bool ddir_is_sync(const struct thread_data *); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RomanSofyin I've chatted to @axboe and it should be sufficient to guard the call sites to ddir_is_sync() by a should_fsync() pre-check.
backend.c
Outdated
|
|
||
| while ((td->o.read_iolog_file && !flist_empty(&td->io_log_list)) || | ||
| (!flist_empty(&td->trim_list)) || !io_issue_bytes_exceeded(td) || | ||
| ddir_is_sync(td) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps move this to be the last case as it's hopefully less common.
When fsync=N is specified and the final I/O issued happens to be the Nth, the following fsync() is missed. This update checks if it needs to loop again in do_io() to issue that final fsync when all the IO work is done. Signed-off-by: Roman Sofin [email protected]
6078913 to
7b7f18c
Compare
I'm currently on holiday but I'll take a proper look when they're over. |
When
fsync=Nparameter specified and the final IO block issued happens to be Nth the fsync should follow it is missed. This update checks if it needs to loop again indo_ioto issue that final fsync when all the IO work is done.Signed-off-by: Roman Sofin [email protected]