Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ValueFlow: extracted valueflowForward*() helpers into a separate file #6947

Closed
wants to merge 2 commits into from

Conversation

firewave
Copy link
Collaborator

No description provided.

@firewave
Copy link
Collaborator Author

This is an important step extract the remaining analysis functions.

I thought this would be a much bigger or more complicated change but it turned out it is just a thin layer. That just shows that valueflow.cpp is too big.

I tried to clean this up a bit more so I could be integrated into vf_common.h or so but I kept that until a later date it make things clearer. It is possible some of the future changes already contain some of the logic of the layer which we can clean up later on.

@firewave firewave force-pushed the vf-forward branch 2 times, most recently from 77fdc50 to 3843c54 Compare October 23, 2024 12:25
@danmar
Copy link
Owner

danmar commented Oct 26, 2024

So we have 6000 lines in valueflow.cpp now.

you continue to extract ~100 lines at a time into a new file. that means we will have ~60 more files if you continue like this.

let's revert your work. 100 lines / file is not reasonable.

@firewave
Copy link
Collaborator Author

As stated above without this change it was not even clear that it just that few lines of code. And I want to have this somewhere else - later on.

@pfultz2
Copy link
Contributor

pfultz2 commented Nov 2, 2024

As stated above without this change it was not even clear that it just that few lines of code.

I think it is clear it is not that many lines of code since the forwarding and reverse function was already refactored into the analyzer classes.

@pfultz2
Copy link
Contributor

pfultz2 commented Nov 2, 2024

So we have 6000 lines in valueflow.cpp now.
you continue to extract ~100 lines at a time into a new file. that means we will have ~60 more files if you continue like this.
let's revert your work. 100 lines / file is not reasonable.

Also, having it split across multiple files makes it much harder to find functions. Its kind of a mess right now, with random things pushed into a "common" header instead of having a clear organization. I would rather have it split across 4 different files with high-level themes:

  • value.cpp: core functions, like Value class and setTokenValue function.
  • analyzers.cpp: Analyzer classes and forward and reverse functions
  • lifetime.cpp: Lifetime analysis and passes
  • valueflow.cpp: All other valueflow passes and ValueFlow::setValues

Each file should have way more then ~100 lines.

@firewave
Copy link
Collaborator Author

firewave commented Nov 2, 2024

I would rather have it split across 4 different files with high-level themes:

I think that might 1 or 2 more of them.

I would very much prefer something like that as well and I think we should go that way - but after the current refactoring. Having all those smaller blocks/modules will make it easier to re-arrange them. Abandoning it mid-way would not help things - especially given the current pace at which things are progressing overall.

The current refactoring is almost there. This change is blocking most of the remaining stuff from being split off.

What I do like about the current approach is that we finally would be able to test (and even benchmark) each pass separately. At the moment it is all blackbox testing (with possibly less then optimal coverage) which is not what unit testing should be. It also helped to enforce the usage of some interfaces. It also made the code easier to review in terms of potential cleanups/optimizations (see #6756).

@pfultz2
Copy link
Contributor

pfultz2 commented Nov 2, 2024

I would very much prefer something like that as well and I think we should go that way - but after the current refactoring.

But the current refactoring is going in the wrong direction, and its make it difficult to navigate the code. I would like everything put back into only a few files. You dont even maintain the valueflow part, but you made it harder for the developers who do.

What I do like about the current approach is that we finally would be able to test (and even benchmark) each pass separately.

If you refactored the passes to use the ValueFlowPass class(instead of using the adaptor) than you can register the classes by name and then have a high-level function to grab the pass. This allows keeping everything in the same file.

Although, I dont see the usefulness of running unit tests on an individual valueflow pass. We dont run unit tests like that for other type of passes like the simplification passes.

Also, we already have a mechanism to benchmark each pass as well, so this doesn't improve on that.

@pfultz2
Copy link
Contributor

pfultz2 commented Nov 3, 2024

Although, I dont see the usefulness of running unit tests on an individual valueflow pass. We dont run unit tests like that for other type of passes like the simplification passes.

Actually, a simpler way would be to just add a glob/regex filter to select the passes we want run and then ValueFlowPassRunner can just run the passes that match.

@firewave
Copy link
Collaborator Author

I am quite irked by the late blowback because I am doing exactly what I lined out in the initial commit and people seemed to be fine with the approach. It is not like I suddenly started doing some totally different.

Since the objections raised were not obvious at the beginning of this I would spin this as an argument why the refactoring should be done and stay the way it is as it would indicate nobody had any idea how all the code into the monolithic files actually depends on each other.

But the current refactoring is going in the wrong direction, and its make it difficult to navigate the code. I would like everything put back into only a few files.

I already seconded that but I would like to finish the refactoring and incorporate all the cleanups/optimizations I amassed. The harm is already done and at least part of the refactoring is good (like more enforcement of interfaces and less visible implementations - i.e. infer, analyzers - conditions would still be ahead).

And I personally find it harder to jump around a multi-thousand lines file (and insight, code navigation in IDEs are also sometimes not too happy if you do modifications to such a file - including Cppcheck) instead of a more focused one with just some functionality outside of it. But as pointed out I am not the maintainer of it and do not have to do the heavy lifting with it. And I think the proposal of re-grouping things makes a lot of sense.

You dont even maintain the valueflow part, but you made it harder for the developers who do.

I just try to stay away from the actual implementations/logic because I do not understand what they are doing. I try to limit things what is just the basic programming stuff. And let's be realistic - it is developer (singular).

Although, I dont see the usefulness of running unit tests on an individual valueflow pass.

I do see usefulness - but more important is if it makes sense. And it does not because the passes are depended on each other so testing them separately would just be wrong.

A case were it would actually sense would the various checks were we most do blackbox testing which leads to drive-by detections. But that is a can of worms I will have to deal with in a different context.

We dont run unit tests like that for other type of passes like the simplification passes.

That is a bad example because that is exactly what we do in the unit tests - and we shouldn't. That's what my work of getting rid of the custom tokenizer implementations in the test code is all about. We also (ab)use the tokenizer for markup files and conditions in Visual Studio projects and that it causing issues and horrible code because we split functionality/passes that cannot be split since it depends on each other. Something I have not looked into.

@pfultz2
Copy link
Contributor

pfultz2 commented Nov 12, 2024

Since the objections raised were not obvious at the beginning of this I would spin this as an argument why the refactoring should be done and stay the way

Seeing that the refactoring's organization is a mess, is not an argument for keeping it a mess. That makes no sense.

it would indicate nobody had any idea how all the code into the monolithic files actually depends on each other.

The problem is that there is a lot of code that is shared across passes. Now we have a common header file filled with random things. Usually a "common" header is a sign of poor organization of the code. Even more so, there is auxiliary functions that are used by passes that are not exposed in the common header, but we could consider reusing such function in the future but we will need to figure out which pass was using that function to try and reuse it. It makes development much harder.

Perhaps there is a way to organize this in a way that makes sense and avoid the common header file, but it will take someone who really understands the code.

I already seconded that but I would like to finish the refactoring

Thats easy for you to say since you dont do development on valueflow.

I just try to stay away from the actual implementations/logic because I do not understand what they are doing.

And that is the crux of the problem, you dont really understand the code nor do you attempt to understand it. I dont think that is good for doing a big reorg of the code.

@firewave
Copy link
Collaborator Author

Seeing that the refactoring's organization is a mess, is not an argument for keeping it a mess. That makes no sense.

I kept doing what I did in the very first step and that already included the "common" code. The initial PR was open for 15 months with another (albeit very brief) discussion beforehand in a different ticket so it is not something that was thrown into the repo over night. In this comment you were even fine with putting each pass into a separate file: #4642 (comment).

And that is the crux of the problem, you dont really understand the code nor do you attempt to understand it. I dont think that is good for doing a big reorg of the code.

That is your opinion. I did attempt to understand it and trying to do that is also what contributed to why this all has taken a long (aside from my health issues and the painstakingly long time reviews take nowadays). But the current organization makes it extremely hard to understand since there are so much different parts to the valueflow (the groupings you outlined in an earlier comment).

Also there is so much other stuff which is so tangled up and hard to understand that the time to spend on each is limited. Even on things I do think I finally understand I have to go back and forth on multiple times to be sure (and lack of test coverage is not helping either) which is causing things to drag.

@pfultz2
Copy link
Contributor

pfultz2 commented Nov 13, 2024

#6991 will revert moving all the passes moving out, but it still keeps other parts of the refactoring such as moving the analyzer classes, and setTokenValue to seperate cpp files. From there, we can move the condition handler to a seperate file. I think this is a good compromise.

@firewave
Copy link
Collaborator Author

Will have a look before the weekend.

@firewave
Copy link
Collaborator Author

#6991 will revert moving all the passes moving out, but it still keeps other parts of the refactoring such as moving the analyzer classes, and setTokenValue to seperate cpp files. From there, we can move the condition handler to a seperate file. I think this is a good compromise.

Yes, keeping/pulling out the big blobs and processing on from there (if) makes sense.

I will drop all my pending local changes except for the condition handler stuff.

What about #6922? Should we still proceed on that?

@firewave
Copy link
Collaborator Author

Closing as this work is no longer being pursued.

@firewave firewave closed this Nov 25, 2024
@firewave firewave deleted the vf-forward branch November 25, 2024 11:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants