-
Notifications
You must be signed in to change notification settings - Fork 543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[scsi-block-nvme] move per device data to #3201
Conversation
Through this patch, I would like to propose moving data collected for "per device" to sub directories. With 100s of devices, single directory gets cluttered and it takes time to search the appropriate commands. With this patch, tried naming the sub directories based on command output they collect OR as logically as possible. Data collected looks more neat and commands are easy to use. Thank you, |
Congratulations! One of the builds has completed. 🍾 You can install the built RPMs by following these steps:
Please note that the RPMs should be used only in a testing environment. |
_subdir1 = "udevadm_block" | ||
_subdir2 = "fdisk_parted" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to name the variables by their usage, e.g. fdisk_dir
or similar.
parted_cmds = [ | ||
"parted -s %(dev)s unit s print" | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the list contains just one item, we can directly use the command "parted -s .."
in the add_device_cmd
, like we do for fdisk -l ..
few lines below, no?
sos/report/plugins/nvme.py
Outdated
@@ -26,6 +26,7 @@ class Nvme(Plugin, IndependentPlugin): | |||
kernel_mods = ('nvme', 'nvme_core') | |||
|
|||
def setup(self): | |||
_subdir = "smart_ns_ctrl_fw_error_regs" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The subdirectory name is human-unreadable (at least for me) - if I would see that name in a directory structure, I won't guess what it contains. Maybe if we want to store outputs of per block devices commands, name the directory devices
or block_devices
? See other usages of that: https://github.com/sosreport/sos/search?q=%22subdir%3D%22.
Also, name of the variable shall be more informative - or since you use it at one place only, you can set subdir="..."
directly there (while I agree this way follows the good approach of listing any plugin's constants at its beginning).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm personally caught up on the subdirectory structure.
I don't mind adding subdirs here in general, but the question to me is do we make these per-command subdirs or per-device? An analogous use case in sos currently is containers throughout several plugins, or perhaps moreso the different resources that OpenShift supports. The current theme has been to make subdirs based on resources (in this case, devices) rather than commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review.
Directories are being created for commands that run on every device available in the system.
For example "nvme list-ctrl %(dev)s" is run for all available nvme devices.
For 1216 nvme devices, there will be more than 10k files in the same directory.
-
$ less proc/partitions |grep nvme |wc -l
1216 -
$ ls sos_commands/nvme/ |wc -l
10946
We do not need to inspect each and every file of the command that is run for all the devices. Only data specific to one or few devices is needed.
Without this patch, for 1216 devices the output looks like - https://people.redhat.com/nyewale/internal/nvme_commands_in_sos.txt
With the proposed patch, the output would always look like
- $ls sos_commands/nvme/
modinfo_nvme modinfo_nvme_core nvme_list nvme_list-subsys smart_ns_ctrl_fw_error_regs
and the commands that are run on every device will collect the output in smart_ns_ctrl_fw_error_regs directory.
Data in this directory would be huge though for a system with 100s of devices.
This is how it looks for 4 devices
-
$ grep nvme proc/partitions
259 0 83886080 nvme0n1
259 1 251658240 nvme0n2
259 2 41943040 nvme0n3
259 3 536870912 nvme0n4 -
$ ls sos_commands/nvme/smart_ns_ctrl_fw_error_regs/
nvme_error-log_.dev.nvme0n1 nvme_id-ctrl_-H_.dev.nvme0n3 nvme_list-ns_.dev.nvme0n1 nvme_smart-log_.dev.nvme0n3
nvme_error-log_.dev.nvme0n2 nvme_id-ctrl_-H_.dev.nvme0n4 nvme_list-ns_.dev.nvme0n2 nvme_smart-log_.dev.nvme0n4
nvme_error-log_.dev.nvme0n3 nvme_id-ns_-H_.dev.nvme0n1 nvme_list-ns_.dev.nvme0n3 smartctl_--all_.dev.nvme0n1
nvme_error-log_.dev.nvme0n4 nvme_id-ns_-H_.dev.nvme0n2 nvme_list-ns_.dev.nvme0n4 smartctl_--all_.dev.nvme0n1_-j
nvme_fw-log_.dev.nvme0n1 nvme_id-ns_-H_.dev.nvme0n3 nvme_show-regs_.dev.nvme0n1 smartctl_--all_.dev.nvme0n2
nvme_fw-log_.dev.nvme0n2 nvme_id-ns_-H_.dev.nvme0n4 nvme_show-regs_.dev.nvme0n2 smartctl_--all_.dev.nvme0n2_-j
nvme_fw-log_.dev.nvme0n3 nvme_list-ctrl_.dev.nvme0n1 nvme_show-regs_.dev.nvme0n3 smartctl_--all_.dev.nvme0n3
nvme_fw-log_.dev.nvme0n4 nvme_list-ctrl_.dev.nvme0n2 nvme_show-regs_.dev.nvme0n4 smartctl_--all_.dev.nvme0n3_-j
nvme_id-ctrl_-H_.dev.nvme0n1 nvme_list-ctrl_.dev.nvme0n3 nvme_smart-log_.dev.nvme0n1 smartctl_--all_.dev.nvme0n4
nvme_id-ctrl_-H_.dev.nvme0n2 nvme_list-ctrl_.dev.nvme0n4 nvme_smart-log_.dev.nvme0n2 smartctl_--all_.dev.nvme0n4_-j
With 100s of devices we tend to overlook the important commands shown above.
Having a separate directory wherever possible for each command is also a good idea. Creating a separate directory for the two udevadm commands may not be good idea though.
Considering the directory structure on case-by-case basis, as needed would surely help I believe.
Thank you,
_subdir1 = "udevadm_scsi" | ||
_subdir2 = "sg_persists_inq" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise above, variable names and value names should be more intuitive.
In general, the changes make sense. Maybe some other plugins iterating over devices deserve the same? Not sure if some tools like Insights dont rely on some particular files location we break by this change (but such important files should have a tag already). |
Thank you for the review. I shall make the mentioned changes. Yes, there are other plugins as well which collect per device data in the same directory and separating it would look the more understandable. One such plugin is network. I can try and check for other plugins.
|
sos/report/plugins/block.py
Outdated
_subdir1 = "udevadm_info_block" | ||
_subdir2 = "fdisk_and_parted" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For code readiness, could be the variable names be more descriptive, like e.g. udevadm_dir
or fdisk_dir
?
sos/report/plugins/block.py
Outdated
subdir=_subdir1) | ||
self.add_device_cmd("parted -s %(dev)s unit s print", | ||
blacklist="ram.*", devices="block", | ||
tags="parted_s_sos", subdir=_subdir2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My first thought was "why the tag has redundant _sos
substring, it is evident the tag belongs to sos
- but we already had fdisk_l_sos
tag so I am OK,
sos/report/plugins/nvme.py
Outdated
_subdir1 = "nvme_smartctl" | ||
_subdir2 = "nvme_fw_smart_error_logs" | ||
_subdir3 = "nvme_list-ns" | ||
_subdir4 = "nvme_list-ctrl" | ||
_subdir5 = "nvme_id-ctrl_-H" | ||
_subdir6 = "nvme_id-ns_-H" | ||
_subdir7 = "nvme_show-regs" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise above, would it be possible to name variables more understandably? (i.e. list_ctrl_dir
)?
sos/report/plugins/scsi.py
Outdated
_subdir1 = "udevadm_info_scsihosts" | ||
_subdir2 = "sg_inq_devices" | ||
_subdir3 = "sg_persist_devices" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto variable names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My first thought was "why the tag has redundant _sos substring, it is evident the tag belongs to sos - but we already had fdisk_l_sos tag so I am OK,
For code readiness, could be the variable names be more descriptive, like e.g. udevadm_dir or fdisk_dir?
Ditto variable names.
Not added _dir for nvme or other commands due to similar reasons as we know that it's a directory. OR shall I change it similar to list_ctrl_dir. Actual command is "nvme list-ctrl" hence there is nvme and its not because of the plugin. Could you please consider giving another thought on the same.
sos/report/plugins/nvme.py
Outdated
self.add_device_cmd(smartctl_cmds, devices='block', whitelist='nvme.*', | ||
subdir=_subdir1) | ||
self.add_device_cmd(nvme_logs_cmds, devices='block', | ||
whitelist='nvme.*', subdir=_subdir2) | ||
self.add_device_cmd("nvme list-ns %(dev)s", whitelist='nvme.*', | ||
devices='block', subdir=_subdir3) | ||
self.add_device_cmd("nvme list-ctrl %(dev)s", whitelist='nvme.*', | ||
devices='block', subdir=_subdir4) | ||
self.add_device_cmd("nvme id-ctrl -H %(dev)s", whitelist='nvme.*', | ||
devices='block', subdir=_subdir5) | ||
self.add_device_cmd("nvme id-ns -H %(dev)s", whitelist='nvme.*', | ||
devices='block', subdir=_subdir6) | ||
self.add_device_cmd("nvme show-regs %(dev)s", whitelist='nvme.*', | ||
devices='block', subdir=_subdir7) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An option that can lead to more concise and readable code:
for cmd, subdir in [(smartctl_cmds, _subdir1),
(nvme_logs_cmds, _subdir2),
..
("nvme show-regs %(dev)s", _subdir7)]:
self.add_device_cmd(cmd, whitelist='nvme.*', devices='block', subdir=subdir)
It is a matter of subjective opinion what is better here, imho, and I am OK with either approach - leaving on the author to decide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Changes makes sense and will help reduce the number of lines of code and will look meaningful. Thank you for the suggestion. Will make the necessary changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the delay. I have made the necessary changes and shall send the modified path by tomorrow.
Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @pmoravec, I have pushed changes to https://github.com/Nitn-Yewale/sos. Could you please check. Please let me know if any action item is on me.
I don't see my recommendations (i.e. directory variable names) followed at all..? (doing so, you will also get rid of the Flake errors https://github.com/sosreport/sos/pull/3201/checks?check_run_id=17389330583 ;-) ) Instead of having a merge commit for rebasing your fork, please do the following:
|
block and nvme commands Signed-off-by: Nitin U. Yewale <[email protected]>
Hello @pmoravec , thank you for all the guidance so far. Could you please check now. Regards, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in a much better state with the condensed dirs, however I still am not a fan of the variable naming being so opaque.
For example, in scsi
, using udevadm_dir
instead of _subdir1
is much more informative and obvious what we're doing to anyone looking at this in the future.
I still have some concern around making the directories per-command instead of per-device, but I'm willing to defer here on that and see how things evolve going forward.
Thank you @pmoravec @TurboTurtle |
subdirectories for storage commands
Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines