Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tests] search for fixed strings, not regexps #3320

Conversation

pmoravec
Copy link
Contributor

grep_for_content should search for a fixed string, not for regexp. The reason is sos is unable to identify 1a2b3c4 as an IP address (which it is not), but grep_for_content would raise false match against 1.2.3.4 .

Resolves: #3320


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname [email protected]?
  • Are any related Issues or existing PRs properly referenced via a Closes (Issue) or Resolved (PR) line?

@packit-as-a-service
Copy link

Congratulations! One of the builds has completed. 🍾

You can install the built RPMs by following these steps:

  • sudo yum install -y dnf-plugins-core on RHEL 8
  • sudo dnf install -y dnf-plugins-core on Fedora
  • dnf copr enable packit/sosreport-sos-3320
  • And now you can install the packages.

Please note that the RPMs should be used only in a testing environment.

@pmoravec
Copy link
Contributor Author

Particular example where current tests raise false alarms: creating a short-lived VM instance that I set hostname based on its IP address - for IP address, say, 12.34.56.78, I set hostname to ip-12-34-56-78. Then "ip-12-34-56-78" appears in many places in sosreport, and the grep -ril does detect it.

It detects it wrongly. The hostname is not an IP address but my fault to "obviously hide" IP address in a way that sos is totally unable to detect it as IP address, but a human can interpret it that way.

We have three options here:

  • change sos parser to detect "ip-12-34-56-78" as IP address - that sounds insane to me, it is not an IP address
  • change testing to search for fixed strings (still with ignored case) - this can be seen as too relaxed testing, maybe? But I can come up with (so far artificial) much more examples where tests would raise false alarms..
  • leave it as a user mistake "dont obviously hide IP addresses that way"

Still I am in strong favour of the 2nd option / grep fixed string.

@TurboTurtle
Copy link
Member

I'm not entirely following what the issue is.

1.2.3.4 is a valid IP address. And ip-12-34-56-78 wouldn't be the ip parser's responsibility to catch, it'd be the hostname parser which does already look for that fixed string.

@pmoravec
Copy link
Contributor Author

Ah right, I simplified my description. The ip-12-34-56-78 is not current hostname, but some original one we renamed to something else. And the current (short) hostname was properly obfuscated to host0. So for any parser, the ip-12-34-56-78 is just something similar to host's IP address but nothing else. That is why cleaner cant obfuscate it - but tests detect it as "exact" match of IP address.

Is it then rather an infrastructure / setting issue that IP address is such visibly "hidden"? Some opinion from @mhradile ?

@mhradile
Copy link
Contributor

mhradile commented Aug 4, 2023

  1. If cleaner is not supposed to parse something as (ip, hostname), but our test parser does, then it is test error and test needs to be adjusted.
  2. If cleaner is supposed to parse something as (ip, hostname). but only our test does, then it is a cleaner bug and should be fixed.

If I understand it correctly, we are in situation 1. and need to fix test.

Our test API function grep_for_content() takes regex as an argument but instead it is used in tests where we do not want the argument to be interpreted as regex but as fixed string. We are using this function incorrectly in a test and the doc string is incorrect.

I do not think it is wise to make this function to take fixed strings. I believe it will find use cases in its previous form (taking regex).

I'd vote for adding the new function for fixed strings with a different name+docs and use that new one in those tests instead additionaly fixing current function's doc string. Other option would be to parametrize the original function to take both regex and fixed strings according to user selection. If we absolutely do not need a function taking regex (I think we do for future tests), we should rename the current function to reflect it's doc string and intended purpose.

@pmoravec
Copy link
Contributor Author

pmoravec commented Aug 5, 2023

Minimalistic reproducer:

  1. have a file (collected by sos) with string like IP address but dots replaced by minuses:

    ip a | sed "s/./-/g" > /etc/qpidd.conf

  2. ensure qpid plugin is collected by adding to /etc/sos/sos.conf:

    [report]
    enable-plugins = qpid

  3. run test:

    avocado run -t stagetwo --nrunner-max-parallel-tasks=1 tests/cleaner_tests/full_report/full_report_run.py

which fails on:

 (6/7) tests/cleaner_tests/full_report/full_report_run.py:FullCleanTest.test_ip_not_in_any_file: FAIL: IP appears in files: /sosreport-FullCleanTest/sosreport-host0-8675309-2023-08-05-dnckqsb/etc/qpidd.conf (0.25 s)

since captured /etc/qpidd.conf has (for IP address, say, 12.34.56.78):

    inet 12-34-56-78/22 brd 12-34-56-255 scope global dynamic noprefixroute ens192

@TurboTurtle
Copy link
Member

For context, the reason we shell out to grep with this anyways is for performance reasons - it's orders of magnitude faster to use grep to iterate over the archive than to do so in python code here. The method name here is intended as a direct reminder to test writers that we are using grep directly.

With that in mind, and the above, I think @mhradile suggestion of parameter-izing this function is the best path forward. I don't have a specific preference for which is the default behavior - but taking a cursory look I think we just happen to have only fixed-string needs right now, so it'd be no objection to me to use -F by default, and only use regex support if explicitly required.

@pmoravec
Copy link
Contributor Author

pmoravec commented Aug 5, 2023

Ack to that defaults - until I forgotten, I will update the PR.

Change grep_for_content to grep for a fixed string by default.

The reason is tests would match 1a2b3c4 as IP address 1.2.3.4 (which it
is not).

Add regexp option to grep_for_content, to allow the default "grep" search.

Resolves: sosreport#3320

Signed-off-by: Pavel Moravec <[email protected]>
@pmoravec pmoravec force-pushed the sos-pmoravec-tests-grep-fixed-string-not-regexp branch from 018d3a0 to 7eb0ec9 Compare August 6, 2023 08:18
@TurboTurtle TurboTurtle merged commit e98e658 into sosreport:main Aug 17, 2023
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants