Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

super-ddf: optimize DDF header search for widely used RAID controllers #136

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

slayercat
Copy link

@slayercat slayercat commented Dec 9, 2024

Added logic to check the last 512 bytes of the device (dsize - 512) first
for the presence of the DDF header (DDF_HEADER_MAGIC).

  • If the header is found, the function immediately proceeds without performing a broader search.
  • Validates the magic value after reading the 512 bytes.

Fallback to searching the last 32MB of the device

using the search_for_ddf_headers function when the last 512 bytes do not contain the header.

  • Performs a block-by-block search to locate the DDF header within the specified region.

see also: #135

@mtkaczyk
Copy link
Member

mtkaczyk commented Dec 9, 2024

HI @slayercat
Contribution here requires understanding of kernel coding style.
There are a lot of comments made by our automation, please address them first (make review job passing).

super-ddf.c Outdated
const size_t READ_BLOCK_SIZE = 4096;
const unsigned long long SEARCH_REGION_SIZE = 32 * 1024 * 1024;

// Determine the search range
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For comments, you should use one of:

/*Body*/

/*
 * Body
 */

@ncroxon
Copy link
Collaborator

ncroxon commented Dec 9, 2024

@slayercat slayercat force-pushed the main branch 2 times, most recently from 527767d to 441074a Compare December 10, 2024 15:54
@slayercat
Copy link
Author

Thank you for pointing out the issues with the code style. I have reviewed and modified the submitted code snippets.

@mtkaczyk
Copy link
Member

super-ddf: optimize DDF header search for widely used RAID controllers
**Added logic to check the last 512 bytes of the device (`dsize -
  512`) first**
for the presence of the DDF header (`DDF_HEADER_MAGIC`).
  - If the header is found, the function immediately proceeds without
    performing a broader search.
  - Validates the magic value after reading the 512 bytes.

**Fallback to searching the last 32MB of the device**
using the  `search_for_ddf_headers` function when the last 512 bytes
do not contain the header.
  - Performs a block-by-block search to locate the DDF header
    within the specified region.

See also: https://github.com/md-raid-utilities/mdadm/issues/135

Signed-off-by: lilinzhe <[email protected]>

Kernel style is not only about the code style. You should also follow commit rules, described here:
https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#describe-your-changes

I think that the description is good but you should divide it into 2 patches. Please do not use markup in commit body, you can keep it in PR description because I'm always cherry-picking commits and PR description is irrelevant.

Sorry, for that! I'm not trying to be nitpicker. As long as mailing list contribution is welcomed we trying to keep similar level of commits quality!

If you are interested, You can explore list online archive to find see comments from kernel folks on mdadm patches (you can search by "mdadm"):
https://lore.kernel.org/linux-raid/

I hope, this exercise will be useful for you to be better prepared for linux kernel contributions :)

Copy link
Member

@mtkaczyk mtkaczyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello,
I did a quick code review. Please see my comments.

Thanks,
Mariusz

super-ddf.c Outdated
* A dummy update_super function is needed
* to enable the mounting feature.
*/
pr_err("%s: not implemented\n", __func__);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

printing error each time is pointless. Please convert this to debug message dprintf or remove it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing that. I have converted this to use dprintf.

super-ddf.c Outdated
int uuid_set, char *homehost)
{
/*
* A dummy update_super function is needed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I see is mdadm assumes that this function exists for each metadata handler.
This is not about mounting at all but about making mdadm to work reliable with DDF.

It would be nice if you can put a crash (I assume it is crashing) in the commit message.

I would rephrase it:

A dummy function is defined because availability of ss->update_super is not always verified. 

I don't see any clear reference to mounting in this context.
Mounting is handler by kernel when mdadm handles assemblation (preparing of the raid device to be mounted on demand later).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing that. I have added the reason in the commit message and found a related discussion about this in the maillist.

commit e3d493d7046cc07b4eca95a2cc24cc59e0ee3b34
Author: lilinzhe <[email protected]>
Date:   Mon Dec 16 12:00:02 2024 +0800

    super-ddf: Prevent crash when handling DDF metadata
    
    A dummy function is defined because availability of ss->update_super is
    not always verified.
    
    This fix addresses a crash reported when assembling a RAID array using
    mdadm with DDF metadata. For more details, see the discussion at:
    https://lore.kernel.org/all/
    CALHdMH30LuxR4tz9jP2ykDaDJtZ3P7L3LrZ+9e4Fq=Q6NwSM=Q@mail.gmail.com/
    
    The discussion centers on an issue with mdadm where attempting to
    assemble a RAID array caused a null pointer dereference. The problem
    was traced to a missing update_super() function in super-ddf.c, which
    led to a crash in Assemble.c.
    
    Signed-off-by: lilinzhe <[email protected]>

super-ddf.c Outdated
@@ -877,30 +877,170 @@ static void *load_section(int fd, struct ddf_super *super, void *buf,
return buf;
}

static int load_ddf_headers(int fd, struct ddf_super *super, char *devname)

/* Search for DDF_HEADER_MAGIC in the last 32MB of the device */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please copy part of the commit message here, I would like to have clarification why we are not following spec as a comment to. blame is volatile.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing that. I have added the reason to the comments at the beginning of search_for_ddf_headers.

/*
 * Search for DDF_HEADER_MAGIC in the last 32MB of the device
 *
 * According to the specification, the Anchor Header should be placed at
 * the end of the disk. However,some widely used RAID hardware, such as
 * LSI and PERC, do not position it within the last 512 bytes of the disk.
 *
 */
static int search_for_ddf_headers(int fd, char *devname,
				  unsigned long long *out)

super-ddf.c Outdated Show resolved Hide resolved
super-ddf.c Outdated Show resolved Hide resolved
super-ddf.c Outdated Show resolved Hide resolved
super-ddf.c Outdated Show resolved Hide resolved
super-ddf.c Outdated Show resolved Hide resolved
super-ddf.c Outdated Show resolved Hide resolved
super-ddf.c Outdated Show resolved Hide resolved
A dummy function is defined because availability of ss->update_super is
not always verified.

This fix addresses a crash reported when assembling a RAID array using
mdadm with DDF metadata. For more details, see the discussion at:
https://lore.kernel.org/all/
CALHdMH30LuxR4tz9jP2ykDaDJtZ3P7L3LrZ+9e4Fq=Q6NwSM=Q@mail.gmail.com/

The discussion centers on an issue with mdadm where attempting to
assemble a RAID array caused a null pointer dereference. The problem
was traced to a missing update_super() function in super-ddf.c, which
led to a crash in Assemble.c.

Signed-off-by: lilinzhe <[email protected]>
@slayercat
Copy link
Author

slayercat commented Dec 17, 2024

Thank you very much for your prompt response and detailed guidance. I have addressed your comments and corrected the submission accordingly.

If there is anything I may have misunderstood or any other areas that need further adjustments, please let me know.

Additionally, I performed a local CI check, focusing on the unit tests that previously failed in the CI server. Strangely, the tests that failed on the server did not fail locally. However, some tests failed locally, such as 07revert-inplace and 09imsm-assemble. I am not sure if these failures are related to the new submission.

I have attached the output of /var/tmp/07revert-inplace.log for your reference.

++ '[' 11 -le 10 ']'
++ echo 'Reshape has not started after 10 seconds'
Reshape has not started after 10 seconds
++ echo 'Waiting for grow-continue to finish'
Waiting for grow-continue to finish
++ wait_for_reshape_end
++ true
+++ grep -Ec '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sync_action=0
++ ((  0 != 0  ))
+++ pgrep -f 'mdadm --grow --continue'
++ [[ '' != '' ]]
++ break
++ sleep 5
++ wait_for_reshape_end
++ true
+++ grep -Ec '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sync_action=0
++ ((  0 != 0  ))
+++ pgrep -f 'mdadm --grow --continue'
++ [[ '' != '' ]]
++ break
++ echo 100
++ echo 500
++ check raid6
++ case $1 in
++ grep -sq 'active raid6 ' /proc/mdstat
++ die 'active raid6 not found'
++ echo -e '\n\tERROR: active raid6 not found \n'

        ERROR: active raid6 not found

++ save_log fail
++ status=fail
++ logfile=fail07revert-inplace.log
++ cat /var/tmp/stderr
++ cp /var/tmp/log /var/tmp/07revert-inplace.log
++ echo '## localhost: saving dmesg.'
++ dmesg -c
++ echo '## localhost: saving proc mdstat.'
++ cat /proc/mdstat
++ array=($(mdadm -Ds | cut -d' ' -f2))
+++ mdadm -Ds
+++ cut '-d ' -f2
+++ rm -f /var/tmp/stderr
+++ case $* in
+++ case $* in
+++ /sbin/mdadm -Ds
+++ rv=0
+++ case $* in
+++ cat /var/tmp/stderr
+++ return 0
++ '[' fail == fail ']'
++ fail FAILED
++ printf '\033[0;31mFAILED\033[0m'
^[[0;31mFAILED^[[0m++ echo ' - see /var/tmp/07revert-inplace.log and /var/tmp/fail07revert-inplace.log for details'
 - see /var/tmp/07revert-inplace.log and /var/tmp/fail07revert-inplace.log for details
++ '[' loop == lvm ']'
++ '[' loop == loop -o loop == disk ']'
++ '[' '!' -z '' -a 0 -ge 1 ']'
++ echo '## localhost: no array assembled!'
++ exit 2

Implemented fallback logic to search the last 32MB of the device
for the DDF header (magic). If found, proceeds to load the DDF metadata
from the located position.

According to the specification, the Anchor Header should be placed at
the end of the disk. However,some widely used RAID hardware, such as
LSI and PERC, do not position it within the last 512 bytes of the disk.

Signed-off-by: lilinzhe <[email protected]>
@mtkaczyk
Copy link
Member

Thanks @slayercat,
I'm sick now and Christmas time is comming. I will proceed with it after holidays.

@slayercat
Copy link
Author

Take care~

I hope you feel better soon!

Wishing you a joyful Christmas season filled with warmth and happiness. Rest up and take your time.

Looking forward to catching up after the holidays.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants