Measuring performance bottleneck #78
Comments
If you are doing high-latency blocking operations in the event loop, you will see this kind of response, because the core of the event loop for the server is:
[…]
It's not quite that simple, but that's generally how it fits together. If you are blocking in […]. You need to identify what the blocking operation is, probably a database query, and then decide if […]. If you have blocking operations that you simply can't avoid, you can spin up a thread and use […].
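As a rough illustration of why a blocking handler shows up as queue time for other requests, here is a small simulation of a single loop that handles one request at a time. It is not Falcon's actual event loop: the `Request` struct, the `simulate` method, and the millisecond figures are all made up for the example, and the clock is simulated rather than measured.

```ruby
# Simulated single reactor: requests are handled strictly one at a time, so a
# handler that blocks delays everything that arrived behind it.
# All names and timings here are hypothetical.
Request = Struct.new(:name, :arrival_ms, :blocking_ms)

# Expects requests already ordered by arrival time.
def simulate(requests)
  clock_ms = 0
  requests.map do |request|
    # The loop cannot pick up a request before it arrives.
    clock_ms = [clock_ms, request.arrival_ms].max
    queue_ms = clock_ms - request.arrival_ms
    # While the handler blocks, nothing else in this loop makes progress.
    clock_ms += request.blocking_ms
    { name: request.name, queue_ms: queue_ms, finished_ms: clock_ms }
  end
end

requests = [
  Request.new("slow query", 0, 500),
  Request.new("fast A", 50, 10),
  Request.new("fast B", 100, 10),
]

results = simulate(requests)
results.each do |r|
  puts format("%-10s queued %3d ms, finished at %3d ms", r[:name], r[:queue_ms], r[:finished_ms])
end
```

The two fast requests need only 10 ms of handling each, yet they spend over 400 ms queued behind the slow one. That is the same shape as a high queue time combined with a low processing time.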
Is starting Falcon in hybrid mode the same as using threads for request handling? Also, I'd like to know how the connection pool plays into this.
That is a good question. Yes, hybrid mode should give you mostly the same performance characteristics as Puma cluster mode. However, ideally you use non-blocking adapters; otherwise there are still some cases where you can experience high latency, e.g. if two connections are within the same reactor on the same thread.

Process Model
One parent process spawns N child processes, one reactor per child process.

Thread Model
One parent process spawns N threads, one reactor per thread. GVL contention.

Hybrid Model
One parent process spawns N processes, and each process makes M threads, one reactor per thread. GVL contention, but more threads = better handling of blocking operations.

Let me know if you need further clarifications - happy to discuss.
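To make the "more threads = better handling of blocking operations" point concrete, here is a minimal sketch using only Ruby's standard `Thread`. It does not use Falcon or its reactors. The `blocking_call` lambda is a hypothetical stand-in for a blocking database query, and `sleep` is used because it releases the GVL the way blocking I/O typically does.

```ruby
# Measure wall-clock time of a block using a monotonic clock.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Hypothetical stand-in for a blocking query (sleep releases the GVL).
blocking_call = -> { sleep 0.1 }

# One thread: each blocking call waits for the previous one to finish.
sequential = elapsed { 4.times { blocking_call.call } }

# Four threads: the blocking calls overlap.
threaded = elapsed { 4.times.map { Thread.new(&blocking_call) }.each(&:join) }

puts format("sequential: %.2fs, threaded: %.2fs", sequential, threaded)
```

The threaded run should take roughly as long as a single call, because the waits overlap. CPU-bound Ruby code would not overlap in the same way, since threads in one process still contend for the GVL.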
We use Scout APM to monitor performance.
It seems Falcon and Puma take different approaches to handling requests.
Falcon has a much higher queue time (the yellow part of the chart, i.e. the time before a request starts being processed) and a low processing time, as if requests are blocked outside the server, waiting to get in.
Puma has a much higher ActiveRecord time (the green part of the chart) and a low queue time.
Both become slow during the benchmark test and have similar response times.
Currently we're able to increase Falcon's throughput by running 8 processes on each 4-CPU machine, up from the original 5.
Is there any way to probe the situation/bottleneck in Falcon?
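One way to probe queue time independently of the APM is a small Rack middleware, sketched below. It assumes a proxy or load balancer in front of the server sets an `X-Request-Start` header in the form `t=<unix seconds>`. Whether this deployment does so is an assumption. The `QueueTimeProbe` class and the `probe.queue_time` env key are hypothetical names for this example, not part of Falcon, Rack, or Scout.

```ruby
require "stringio"

# Hypothetical Rack middleware: records how long a request waited between the
# front proxy stamping it and the application starting to handle it.
# Assumes the proxy sets X-Request-Start as "t=<unix seconds, fractional>".
class QueueTimeProbe
  def initialize(app, clock: -> { Time.now.to_f }, log: $stderr)
    @app = app
    @clock = clock
    @log = log
  end

  def call(env)
    queue_time = queue_time_for(env)
    if queue_time
      env["probe.queue_time"] = queue_time
      @log.puts format("queue_time=%.3fs path=%s", queue_time, env["PATH_INFO"])
    end
    @app.call(env)
  end

  private

  # Returns seconds spent waiting, or nil if the header is missing or invalid.
  def queue_time_for(env)
    header = env["HTTP_X_REQUEST_START"]
    return nil unless header

    started = header.delete_prefix("t=").to_f
    return nil if started <= 0

    # Clamp at zero in case of clock skew between proxy and server.
    [@clock.call - started, 0.0].max
  end
end

# Usage with a fixed clock so the result is deterministic.
app = ->(_env) { [200, { "content-type" => "text/plain" }, ["ok"]] }
log = StringIO.new
probe = QueueTimeProbe.new(app, clock: -> { 100.5 }, log: log)
env = { "HTTP_X_REQUEST_START" => "t=100.0", "PATH_INFO" => "/" }
status, _headers, _body = probe.call(env)
```

In a real application this would be mounted with `use QueueTimeProbe` in `config.ru`. Comparing the logged values across process counts would show whether requests are waiting to be accepted or waiting behind a blocked handler.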