Fuzzing Coverage Expansion #407

nmlsg · 2023-09-01T14:51:33Z

This pull requests expands fuzzing coverage for oss-fuzz with new fuzzers and additional corpuses.

dillof · 2023-09-01T15:31:40Z

Thanks for improving our fuzz setup, we very much appreciate it.

I haven't looked at it in detail yet, but could you please put the fuzzer specific files in their own directory? I would like to keep the regress directory for our test suite only. I would prefer a new top level directory, but if it's easier to integrate with the fuzz infrastructure, a subdirectory under regress is also acceptable.

nmlsg · 2023-09-01T16:03:02Z

Hi @dillof
It will be a problem because in such way I will need to make a commit to oss-fuzz directory in order to re-write a Dockerfile

But I don't have any access to oss-fuzz repo.
You could accept a PR, and then redesign the folder structure ( If you have an access to oss-fuzz repo ).

nmlsg · 2023-09-06T13:44:23Z

Hi @dillof
I see that PR is still open, do you need something specific from my side?
Let's discuss it !

0-wiz-0 · 2023-09-14T20:14:24Z

This pull request needs major cleanups.

Please put the new fuzzer code in regress, that's fine, but not top-level.
don't move files around in the fuzzer shell script
you seem to add a lot of corpus data, but I only see it used by being zipped up in one file in the fuzzer shell script. where are they used?
if the corpus data is really used, it should either be just added as (one to four) zip files directly to the repository, or if there's a reason the data needs to be there unpacked as well, it should be in a subdirectory of regress e.g regress/corpus, and the fuzzer shell script adapted as necessary
Please add some description what the new fuzzers want to test, as comments in the source code.
Thanks!

nmlsg · 2023-09-14T20:32:50Z

Hi @0-wiz-0,

@dillof asked me to create a new subdirectory under 'regress.' As a result, I have updated the shell script and also created a new subdirectory. Just to confirm, is it okay to move all the fuzzers directly to the 'regress' directory in this repository?

I'm using corpuses as initial seeds for the fuzzers. According to the oss-fuzz documentation, the recommended approach is to zip the corpuses with the specific name 'name_seed_corpus.zip.' In the initial setup of the repository, you had created a zip file with all the corpuses in shell script, however I can archive them once and upload the prepared archive if you would like me to.

0-wiz-0 · 2023-09-15T08:26:24Z

@nmlsg Please take a step back and consider the existing project, not just the fuzzer part.
libzip has a regression suit that can be run manually or in an automated way, which is done regularly by the CIs.
The tests and their data exist in the regress directory.
So the existing zip archives are crafted to be used by specific tests to cover particular functionality.
These files must be left available for the tests.

The previous fuzzer setup just took advantage of the existing test data and used that.

You added test data that is not zip archives (like pdf, php, js, xml), I'm not quite sure why (because libzip is not handling any of these formats), and put it unpacked in the regress directory. We saw that that as noise from the point of view of regression testing and asked you to put it in a subdirectory, or pack it up.

Your latest change zipped them up, but also removed the test data we use for regression testing.

Please, for future changes, make sure that the test suite runs through successfully.

Here's an updated list of requests so this can be merged:

restore the test data so the test suite can run again
document the new tests
remove the c source code you added to the top-level
Thank you.

nmlsg · 2023-09-15T09:10:57Z

@0-wiz-0 I have cleaned up the 'regress' directory, removed 'C' source code from the top-level directory, restored the test data, and thoroughly documented the targets for the new fuzzers. Please take a moment to review these changes. Thank you.

0-wiz-0 · 2023-09-15T09:20:07Z

Thank you, that's better.

I still see 4 four new toplevel *.cc files
bigstored.zh was removed, please restore it
some of the corpus zip files contain other corpus zip files, which looks like it was an accident - can you please check?

Thanks.

nmlsg · 2023-09-15T09:42:11Z

@0-wiz-0 Thanks for your time ! I have pushed the changes, check it please !

The corpus files, for example, 'zip_write_encrypt_aes256_file_fuzzer_seed_corpus.zip,' consist of multiple files that serve as the initial seeds for the fuzzer. Similarly, 'zip_read_encrypted_file_fuzzer_seed_corpus.zip' contains various zip archives, each containing encrypted files.

0-wiz-0 · 2023-09-15T11:54:08Z

Thank you for the changes.
As to the code:

I don't understand the changes in zip_read_fuzzer - are they intended? Can you please explain why you create an archive and immediately remove it?
The file name says AES 256, but the code is for AES 128
I think when we're fuzzing the encryption, it would be good to also use a random password each time, right? (that's for both AES and pkware fuzzers)
For zip_read_encrypted_file_fuzzer: I think this will fail a lot - if it fuzzes the input data, it will not be valid AES encrypted data and then decryption will fail. But perhaps that's the point?
The corpus files for the other fuzzers are included in the corpus file for the encryption fuzzer. Is that really useful?

nmlsg · 2023-09-15T12:11:32Z

To cover the 'zip_open' function, I have utilized I/O operations to create an archive and subsequently remove it, enabling the loading of data via the 'zip_source_buffer_create' function.
Acknowledge my mistake and will rectify it by changing it to 'ZIP_EM_AES_256'. I conducted experiments involving various encryption algorithms and key lengths ;)
It seems like a good idea; I can implement random password generation into the process.
I have pre-generated valid archives with encrypted files inside. Although mutations from the oss-fuzz side may lead to potential failures, we have implemented handling mechanisms to prevent false-positive crashes.
It's essential to experiment with different data and file types to observe their behavior, so yes, we have to add all of theese file "types"

0-wiz-0 · 2023-09-15T17:52:23Z

Thank you for the improvements.

The corpus data for the AES tests is AES128 encrypted. Shouldn't this match the code and be AES256? (How do you test your commits?)
Is the corpus data available under a free license? Where did you get the files? (js, xml, pdf, ...)
The AES corpus data zip file contains the complete corpus zip files of two other corpora. That looks wrong to me - it looks like an oversight, but if it's not, it at least wastes space. If we really want to include the corpus files inside this zip archive, we could add them to the zips in the fuzzer shell script.
Same problem for the pkware corpus file.
I still don't understand the changes to zip_read_fuzzer. The newly added code does something completely different than the existing code there. If you think it's useful, it should be a separate fuzzer cc file.

nmlsg · 2023-09-15T20:07:47Z

I appreciate your comments, and I'm thankful for your time and efforts.

Regarding the '_zip_get_encryption_implementation' function, it consistently returns 'zip_source_winzip_aes_decode' or 'zip_source_winzip_aes_encode' for all key lengths related to AES encryption. I've conducted local tests and also have a deployed instance of oss-fuzz on my virtual machine with the repository URL modified to 'nmlsg/libzip.'
I generated some of the corpuses manually, while others were obtained from this repository repo_link.
I acknowledge the issue you mentioned, and I'm addressing it in a new commit. It appears that something went wrong with my use of 'xargs,' and I've made the necessary corrections.
Regarding the method of opening and handling archives by writing data to disk from a fuzzer, this approach is designed to cover the 'zip_open' function (zip_read_fuzzer target) . It seems reasonable to leave it as is, but if you believe it's better to create another target for this, I'm open to doing so. This method effectively covers a substantial portion of the codebase.

0-wiz-0 · 2023-09-16T19:41:12Z

Thank you for the information. Wow, you did set up your own fuzzing instance!

About the test data - I looked at the repository and it doesn't have any license information as far as I can see - so we can't use it. I think for the purposes of libzip fuzzing, two cases are probably enough:

use completely random input data
use ASCII random input data (so only characters from bytes 32-127)
I think that should cover enough for our purposes.

As for zip_read_fuzzer, I think I understand what you mean now - the new code tests if the file data that is provided can be read using zip_open() if written to a file.
However, the existing (previous) code is just run as is, and duplicates part of this.
I think it would be nicer to leave zip_read_fuzzer.cc as is, and add a zip_read_file_fuzzer, that's basically the same (including reading all the data) but does it after writing the file data to disk, like your new code does. I.e. at the beginning, write the file data to disk, then use the loop from zip_read_fuzzer that reads all entries, and finally zip_close the file and remove it.

Thanks!

0-wiz-0 · 2023-09-16T19:41:49Z

Oh, and please take a look at the CIFuzz failure in the last run for the pull request. Are these alrady libzip fuzzing issues or problems in the setup?

nmlsg · 2023-09-17T14:29:27Z

Hi @0-wiz-0
Ihave recently made several modifications, including the addition of a new fuzzer called 'zip_read_file_fuzzer,' and have also addressed minor issues in existing fuzzers.

Based on my evaluation, the changes appear to be functioning as expected both locally and on my deployed machine, and there are no indications of issues with the ossfuzz setup in this pull request.

I am considering the generation of new corpora that exclusively contain ASCII characters to replace the current ones. Would you like me to do it?

nmlsg · 2023-09-17T14:54:23Z

@0-wiz-0 I have implemented all requisite modifications. Kindly review them at your earliest convenience.

0-wiz-0 · 2023-09-17T18:43:02Z

There were some minor things I cleaned up myself.
I've now merged your pull request, thank you for your work on this!
Looking forward to the new ossfuzz reports :)

nmlsg · 2023-09-17T18:50:59Z

@0-wiz-0 Thanks for your work, I really appreciate that !

nmlsg · 2023-09-21T12:01:01Z

Hello @0-wiz-0,

I've observed that there might be an issue with the recent commit you made, which seems to have caused a breakdown in the OSS-Fuzz setup. Could you please consider rolling back this commit? Here is the latest log for reference: Link to Log. *Step #3 - "compile-libfuzzer-coverage-x86_64": [1m/src/libzip/regress/fuzzers/zip_write_encrypt_aes256_file_fuzzer.cc:70:1: [0m�[0;1;31merror: �[0m�[1mdeclaration of 'random_string' has a different language linkage�[0m

I've also taken the initiative to test the setup locally with actual repository , and it has failed. Interestingly, when I tested my fork of the repository link, it worked perfectly fine, I assume that your last commit affected the whole setup.

0-wiz-0 · 2023-09-21T22:04:51Z

I've fixed the random_string issue - the problem was that the prototype needed to be inside the extern "C" block.
Thanks for pointing out the error.

I have a local change that builds the fuzzers during a libzip build to find issues such as this one in the future, but this now needs a C++20 compiler, so I can't commit it as-is (because libzip is a C library). It would be nicer if the fuzzers were written in C instead of C++.

nmlsg added 2 commits September 1, 2023 16:16

new fuzzers and corpuses

668aeb6

added standart corpuses for decryption

f2fbd1e

new folder structure

44ffff7

nmlsg added 2 commits September 14, 2023 23:09

redesigned oss-fuzz setup structure

ce53e5a

added corpuses as a separate zip archive

5790cf1

improved oss-fuzz setup

3d14bf8

removed unnecessary archive

2430ff8

bigstored.zh was restored

87bc7ce

nmlsg force-pushed the main branch from 31016f0 to 87bc7ce Compare September 15, 2023 09:36

fuzzer targets from the top-level were removed

da37e93

minor fixes + random password generation was implemented

f0c5e22

Minor cleanups in corpora.

81b794e

nmlsg added 3 commits September 17, 2023 13:53

new fuzzer was added + project structure was redesigned

7cd4db5

ossfuzz.sh was changed with respect to the new fuzzer

5ccb3e8

zip_read_file_fuzzer was fixed

47757f0

fixes in zip_read_file_fuzzer

da7e443

new corpuses

f08a773

changes in corpora

28c5db8

0-wiz-0 merged commit 28c5db8 into nih-at:main Sep 17, 2023
0 of 2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fuzzing Coverage Expansion #407

Fuzzing Coverage Expansion #407

nmlsg commented Sep 1, 2023

dillof commented Sep 1, 2023

nmlsg commented Sep 1, 2023 •

edited

Loading

nmlsg commented Sep 6, 2023

0-wiz-0 commented Sep 14, 2023

nmlsg commented Sep 14, 2023

0-wiz-0 commented Sep 15, 2023

nmlsg commented Sep 15, 2023

0-wiz-0 commented Sep 15, 2023

nmlsg commented Sep 15, 2023

0-wiz-0 commented Sep 15, 2023

nmlsg commented Sep 15, 2023 •

edited

Loading

0-wiz-0 commented Sep 15, 2023 •

edited

Loading

nmlsg commented Sep 15, 2023

0-wiz-0 commented Sep 16, 2023

0-wiz-0 commented Sep 16, 2023

nmlsg commented Sep 17, 2023 •

edited

Loading

nmlsg commented Sep 17, 2023

0-wiz-0 commented Sep 17, 2023

nmlsg commented Sep 17, 2023

nmlsg commented Sep 21, 2023 •

edited

Loading

0-wiz-0 commented Sep 21, 2023

Fuzzing Coverage Expansion #407

Fuzzing Coverage Expansion #407

Conversation

nmlsg commented Sep 1, 2023

dillof commented Sep 1, 2023

nmlsg commented Sep 1, 2023 • edited Loading

nmlsg commented Sep 6, 2023

0-wiz-0 commented Sep 14, 2023

nmlsg commented Sep 14, 2023

0-wiz-0 commented Sep 15, 2023

nmlsg commented Sep 15, 2023

0-wiz-0 commented Sep 15, 2023

nmlsg commented Sep 15, 2023

0-wiz-0 commented Sep 15, 2023

nmlsg commented Sep 15, 2023 • edited Loading

0-wiz-0 commented Sep 15, 2023 • edited Loading

nmlsg commented Sep 15, 2023

0-wiz-0 commented Sep 16, 2023

0-wiz-0 commented Sep 16, 2023

nmlsg commented Sep 17, 2023 • edited Loading

nmlsg commented Sep 17, 2023

0-wiz-0 commented Sep 17, 2023

nmlsg commented Sep 17, 2023

nmlsg commented Sep 21, 2023 • edited Loading

0-wiz-0 commented Sep 21, 2023

nmlsg commented Sep 1, 2023 •

edited

Loading

nmlsg commented Sep 15, 2023 •

edited

Loading

0-wiz-0 commented Sep 15, 2023 •

edited

Loading

nmlsg commented Sep 17, 2023 •

edited

Loading

nmlsg commented Sep 21, 2023 •

edited

Loading