Skip to content

Conversation

tonyhutter
Copy link
Contributor

Motivation and Context

Fixes: #17761

Description

The zvol blk-mq codepaths would erroneously send FLUSH and TRIM commands down the read codepath, rather than write. This fixes the issue, and updates the zvol_misc_fua test to verify that sync writes are actually happening.

How Has This Been Tested?

Updated test case. Ran test case on master and saw it fail to send enough sync writes to the log device with blk-mq. Re-ran with this PR and saw it send the expected number of sync writes to log with blk-mq.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

if (op_is_write(req_op(rq))) {
/* Flush & trim operations go down the zvol_write codepath. */
if (op_is_write(rq->cmd_flags) || op_is_flush(rq->cmd_flags) ||
op_is_discard(rq->cmd_flags)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether REQ_FUA | REQ_PREFLUSH flags are really write-specific. Because if they can be combined with reads, you might sent reads into a write pipeline. I see op_is_flush() does not check for absence of other flags.

if (op_is_write(rq->cmd_flags) ||
((op_is_flush(rq->cmd_flags) ||
op_is_discard(rq->cmd_flags)) &&
req_op(rq) != REQ_OP_READ)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

op_is_discard() checks specifically for REQ_OP_DISCARD opcode, so should not have this problem and can be moved out of the parentheses. On the other side op_is_write() seems defined in a way that covers op_is_discard(). The problem may be only with op_is_flush(), since it checks for flags, but not for opcode, but I wonder if what is the semantics here? What opcode do you see when you experience the problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the unique opcode + flags combinations I've seen while running zvol_misc_fua.ksh & zvol_misc_trim.ksh:

write: SYNC
write: FUA | SYNC
flush: PREFLUSH
read:
discard: SYNC

I agree that currently a request can't have both discard and write opcodes, but that's not necessarily a guarantee into the future. We only have these op_is_*() functions as an API, unfortunately.

Maybe the logic should be like: "if it's a write, or if it's op_is_sync() and not a read, or if it's a REQ_OP_FLUSH, or if it's a discard, then send down the write path".

The zvol blk-mq codepaths would erroneously send FLUSH and TRIM
commands down the read codepath, rather than write.  This fixes
the issue, and updates the zvol_misc_fua test to verify that
sync writes are actually happening.

Fixes: openzfs#17761
Signed-off-by: Tony Hutter <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

zvol: blk-mq sync isn't working correctly
2 participants