-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: unittest for starting router and adding autoLink #880
base: main
Are you sure you want to change the base?
Conversation
@@ -566,6 +566,7 @@ void qd_alloc_finalize(void) | |||
free_stack_chunks(&desc->global_pool->free_list); | |||
free(desc->global_pool); | |||
desc->global_pool = 0; | |||
desc->header = 0; // reset header, so we can initialize again later in qd_alloc (in subsequent test) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually something I think is missing in the dispatch code. Without it, it is not possible to stop the router and start it again, in the same process.
tests/c_unittests/test_add_link.cpp
Outdated
/// unpleasantries (I observed some invalid pointer accesses) | ||
void wait() const { | ||
// todo Can I detect when startup is done? | ||
std::this_thread::sleep_for(std::chrono::milliseconds(500)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't figured this out. If I don't wait, I get crashes. Is there something specific I can wait for, or do I need a random timeout, or maybe for a specific line in log? Initializing dispatch starts new thread(s) and I cannot then call qd_dispatch_free()
before the threads are done initializing.
And @nicob87, what do you thing? |
Some progress, I can give it a map at last
|
@@ -520,6 +524,7 @@ void qd_log_finalize(void) { | |||
qd_log_entry_free_lh(DEQ_HEAD(entries)); | |||
while (DEQ_HEAD(sink_list)) | |||
log_sink_free_lh(DEQ_HEAD(sink_list)); | |||
default_log_source = NULL; // stale value would end up misconfiguring router again in the same process |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Global variable that needs to be NULLed out on exit, otherwise starting again in the same process segfaults on logging.
Its working, its working... more or less. The test to add autoLink through management message really adds an autoLink, and can do it without any network communication! And I cannot reduce scope of the test any further, because then I'd hit static functions and types that are defined in .c files, so I cannot see inside. More work needed, but it starts to look hopeful. |
Faster router startup using BetterRouterStartupLatch to wait for the core thread. It is a bit convoluted, because C function pointers do not work with C++ capturing lambdas (C/C++ does not have dleegates like C# does), but it seems to work much better than the 500ms sleep. New test that tries to send message through router. There is apparently something crucial missing, because I get leaks. I have to yet figure out what I've actually accompished through calling
|
I really like the idea! |
@nicob87 Idea is probably good, but the implementation may not be ideal, yet. Setup for this kind of a test takes about 80 ms, because it starts most of the router, and initializes Python. That may be too slow to initialize new router for every single test. But the point of all this is to have isolated tests. I am not sure how much of C++ features I should be using. It can simplify things, but it is a different language than C, with all the problems that come from that. The uniqe_ptr with deleters looks really nice, if it can be wrapped in something less verbose than what's in the code fragment below. Creating fake messages would also be much easier with a C++ wrapper over the C API. Doing some of the setup is lengthy and it somewhat duplicates the production code. Meaning it would complicate keeping tests in sync with production code. See for example what it looks like if I want to initialize Doing it through Some things cannot be even initialized directly, because there are opaque C structs. In a different type of a test (where I'd include specific C files I need, and not the whole dispatch so library, I could include the needed C files... This can be solved by using the |
afe3c92
to
71e4ae6
Compare
It looks like it's necessary to use I'm not sure what the next step in the test should be, but hopefully I'll make some sense of it later. If there is to be multiple tests like this, I'll need some helper methods to create the links, etc; instead of using the all those methods with 10+ arguments directly. I am still not sure that what I am now trying is the correct level on which to write such a test. |
@@ -371,6 +371,8 @@ void qd_dispatch_free(qd_dispatch_t *qd) | |||
qd_python_finalize(); | |||
qd_dispatch_set_router_id(qd, NULL); | |||
qd_dispatch_set_router_area(qd, NULL); | |||
|
|||
free(qd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a funny issue; dispatch does not free this, but because there is a global variable pointing to the current qd_router_t, it is not detected as a leak; when I want to start the router multiple times in a single process, subsequent initializations overwrite the global, and the leak gets reported
@@ -972,6 +972,8 @@ void qd_connection_manager_free(qd_connection_manager_t *cm) | |||
config_sasl_plugin_free(cm, saslp); | |||
saslp = DEQ_HEAD(cm->config_sasl_plugins); | |||
} | |||
|
|||
free(cm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a funny issue; dispatch does not free this, but because there is a global variable pointing to the current value (I am guessing through qd_router_t, but I did not investigate this instance thoroughly), it is not detected as a leak; when I want to start the router multiple times in a single process, subsequent initializations overwrite the global, and the leak gets reported
and with that, c_unittests has passed on this branch, without failures and leaks |
273b722
to
f89cbee
Compare
Here's a microbenchmark for TCP Echo server latency with and without Dispatch
|
Note to self: probably something I introduced here. I never saw this on master https://github.com/apache/qpid-dispatch/runs/1995248602?check_suite_focus=true#step:9:755
|
tests/c_benchmarks/bm_add_links.cpp
Outdated
// if I kill dispatch first, this then may/will hang on socket recv (and dispatch leaks significantly more) | ||
stop_echo_server(); | ||
u.join(); | ||
|
||
qdr.stop_run(); | ||
t.join(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ChugR This actually sounds like your DISPATCH-1972. Lets see if the situation improves when that's fixed.
f89cbee
to
14c9750
Compare
See the 7 other warnings above this one, they come from doctest itself ;( https://travis-ci.com/github/apache/qpid-dispatch/jobs/498901258#L9490
|
Codecov Report
@@ Coverage Diff @@
## main #880 +/- ##
==========================================
- Coverage 84.29% 80.53% -3.77%
==========================================
Files 112 113 +1
Lines 27694 29017 +1323
==========================================
+ Hits 23346 23369 +23
- Misses 4348 5648 +1300
Continue to review full report at Codecov.
|
3ba4879
to
0d52565
Compare
956b2ea
to
f05af53
Compare
…eral and string macro [-Werror=literal-suffix]
…ween literal and string macro [-Werror=literal-suffix]" This reverts commit ade432e
…en literal and string macro [-Werror=literal-suffix]"
2cb224d
to
74c5967
Compare
This is a quick work-in-progress attempt that I've created.
First, majority of functions in the router require that the memory pools are initialized. I've decided to just start the whole router to get that. It may be possible to initialize less of the router, which I will explore later.
Being able to initialize less will be also good for timed unittests, essentially small benchmarks, around for example, measuring how long adding autoLinks takes, and so on.
Starting and stopping router has one small caveat. The way I do it now, if I shut down the router too soon after starting it, things break (I get leaks and/or segfaults). So I had to add a 500ms wait. There has to be a way to detect that startup is done, but I haven't found it yet.
I need to redirect logging in the test. I think it should not be difficult to accomplish, but I did not look into it yet. Looking at the logs is going to be the way to detect problems, because the router code often logs thing right away, and does not return error information in return codes. So this has to be done well, because it will be used often.
Regarding the specific tests, I was thinking about the following issues
This is what the code in the PR prints. The first is from trying to start the broker twice, and the last is from adding a linkRoute. There is no separation between, but it is mostly three times the same thing. Not something that should be printed during regular runs of the test, it's there for now just to show something.