-
Notifications
You must be signed in to change notification settings - Fork 1.5k
fix #13333: better handling when source file is repeated in compile_commands.json #7469
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
We should have opt-in/opt-out CLI options so this behavior can be overridden. And it should probably be disabled by default (I feel like there was already a discussion where I was flip-flopping on this). We should also somehow inform the user when a file is omitted because it is a duplicate (within |
Sorry, I overlooked that there is already a message. And some Python tests for this need to be added. |
Something for a future improvement. Some of the settings could be sanitized so they are not different if the order is different. Please do not do this for the defines because they need to be reworked from the bottom up first (simplecpp et al) and that will probably implicitly fix that. |
The hash should be dependent of the order of defines/undefs now. |
What's a scenario where you'd want it to be disabled? |
Where it omits files from the analysis it should not (i.e. bugs). The existing handling of duplicates in other areas has been lacking so there should be a way to disable it. There will be more stuff added to |
So a flag to disable all de-duplication? |
Just for this added logic as it many parts which make up the hash. The existing de-duplication is based on the (real) path being unique so that will only miss de-duplication but not accidentally omit files it should not because it is just a single value to compare. |
cli/cmdlineparser.cpp
Outdated
while (it != fileSettings.end()) { | ||
fileSettings.erase(std::remove_if(std::next(it), fileSettings.end(), [&](const FileSettings& fs) { | ||
if (fs.hash == it->hash) { | ||
mLogger.printMessage("removing duplicate file " + fs.filename()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The word "removing" sounds pretty aggressive. I think "skipping duplicated file entry " or something like that may be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this should be written unconditionally. It can mess up plugins because the format is not expected.
It does not affect the output so personally I don't understand why we print it out at all. I guess you wanted this @firewave explain why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this should be written unconditionally.
It should not (I even suggested it to be put into --check-config
- we point out that you should implement such a step in the CI integration guide). BUT as this changes behavior which has been in place for a very long (possible since the import was introduced) and might potentially lead to files wrongly being skipped (given our track record I do not trust any first version of "breaking" changes), it should not be something which happens silently (especially given this is about security).
The messages can be fixed by the user by adjusting the command-line (possibly easy) or the project file (most likely not easy). But having an option to opt-out would also mitigate it.
So I am torn...
That also makes me think if we should provide more extensive filtering of projects as you currently only match the file name.
It can mess up plugins because the format is not expected.
Well - we have been bringing this up a lot but also introduced lots of output which would have broken this in the same time and strangely never gotten a report about that.
I personally had to deal with those in the CLion IDE integration though. So it is actually valid.
This will all be better when these messages are turned into actual findings (but I keep finding other broken stuff).
cli/cmdlineparser.cpp
Outdated
@@ -635,6 +651,9 @@ CmdLineParser::Result CmdLineParser::parseFromArgs(int argc, const char* const a | |||
else if (std::strcmp(argv[i], "--debug-clang-output") == 0) | |||
mSettings.debugClangOutput = true; | |||
|
|||
else if (std::strcmp(argv[i], "--deduplicate") == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that this word "deduplicate" sounds weird. It sounds generic but it only works for imported projects so how how about : --import-project-recheck-duplicate-entries
. And can we check that it's only used in combination with --project
because otherwise the user is confused?
I would say it should be on by default if there are only downsides and no upsides to turn it off..
@firewave why would anybody turn it off?
we could run cppcheck on all files twice but we don't and there is no flag to turn that on/off.. I don't see why we would do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that this word "deduplicate" sounds weird.
Something along the lines of --exclude-duplicated-project-files
. As it only applies to FileSettings
it has to mention projects. And if the project does not support it we should bail out.
But this is not just about duplicated files but duplicated "file + configuration" entries. We are currently only allowing input files or a single project to import. After FileWithDetails
and FileSettings
have been merged and the imports have been reworked a bit I think we could possibly support multiple different inputs. So then having "project" in the name would be misleading.
I am probably overthinking this but as we are not very good with deprecating stuff we should look a bit ahead to avoid having stuff we need to support as legacy down the line.
@firewave why would anybody turn it off?
we could run cppcheck on all files twice but we don't and there is no flag to turn that on/off.. I don't see why we would do it.
deduplicate by default
9772903
to
ed50f3b
Compare
No description provided.