Replies: 4 comments 6 replies
-
You mention only 5% CPU used and it's a 24-core system. Are there some processes using 70-100% CPU, or are all running processes under 10%? By 'nominatim-api' do you mean you followed https://nominatim.org/release-docs/latest/admin/Deployment-Python/ and it's all HTTP requests? (There's also a Docker setup, and one can use Nominatim as a Python library; I just want to check what you use.)
-
Thanks for your response. I reworked my code in Databricks and eliminated threading in favor of Spark parallelization, so Spark now uses all available CPU cores to do the work. Since the cluster currently only has 16 cores available (it tries to scale, but the subscription does not allow more nodes), far better performance should be possible. It now takes ~15 ms to resolve one address to geo-coordinates over HTTP, and CPU usage on the Nominatim host goes to 400%, which is still not much while 2400% is available.
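A quick back-of-envelope check of what this Spark setup could achieve at best. The latency (~15 ms) and core count (16) are from the post above; the "one in-flight request per core" assumption is mine, since plain Spark parallelization keeps one synchronous request per task slot:

```python
# Numbers from the post; the one-request-per-core assumption is hypothetical.
latency_s = 0.015        # ~15 ms per HTTP round trip
cores = 16               # Spark cores currently available
addresses = 6_600_000

max_throughput = cores / latency_s             # one in-flight request per core
total_hours = addresses / max_throughput / 3600
print(f"~{max_throughput:.0f} req/s, ~{total_hours:.1f} h")  # → ~1067 req/s, ~1.7 h
```

So even without more nodes, ~1.7 hours for the full initial load would be the ceiling of this configuration, which is consistent with the observation that CPU on the Nominatim side still has plenty of headroom.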
-
Since the library is only for local usage, it would be nice if there were an API endpoint for batch processing.
-
Thanks for your suggestion. I will keep the existing setup. It is fine if the initial load takes about two days; the ongoing data ingest will be fast enough.
-
I have Nominatim running on a VM with 128 GB RAM and 24 cores. My storage is a ZFS pool of 8 NVMe SSDs, connected via 2 × 10 GBit/s. It takes ~45 ms to resolve one address with a structured query against the API. Since I initially need to resolve approx. 6.6 million addresses to geo-coordinates, it would take approx. 82.5 hours to resolve them all. This is not a big deal, but I wonder what the bottleneck in my setup is.
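For context, a structured query against Nominatim's `/search` endpoint just passes the address parts as separate query parameters. A minimal sketch of building such a request URL (the base URL is a hypothetical local endpoint; the parameter names are the documented structured-query fields):

```python
from urllib.parse import urlencode

BASE = "https://nominatim.example.org/search"  # hypothetical local instance

def structured_query_url(street, city, postalcode, country):
    # Structured-query parameters as documented for Nominatim's /search endpoint
    params = {
        "street": street,
        "city": city,
        "postalcode": postalcode,
        "country": country,
        "format": "jsonv2",
        "limit": 1,
    }
    return f"{BASE}?{urlencode(params)}"

print(structured_query_url("Unter den Linden 1", "Berlin", "10117", "Germany"))
```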
The resolving script (Python) runs in Azure Databricks. The Databricks cluster has up to 10 worker nodes available.
In fact, only 2 worker nodes run the Python script, which fires against the Nominatim API with 500 worker threads in parallel. More than those 2 worker nodes will not be used, so the Databricks cluster does not seem to be the bottleneck. My Nominatim VM uses all 128 GB RAM, which seems totally fine, but the CPU is only at 5% utilization. Disk and network usage are also really low. Nominatim sits behind a reverse proxy, but the VM that runs the reverse proxy also has really low CPU and network usage.
So the question is: what limits the throughput? I am fully aware of internet latency and so on, but parallelization should increase the throughput. Yet there seems to be a limit.
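One way to reason about such a limit is Little's Law: in steady state, in-flight requests = throughput × latency. If 500 client threads do not raise throughput, the effective server-side concurrency must be far below 500. A sketch, using the ~45 ms latency from the post (the measured request rate is a placeholder you would fill in):

```python
def effective_concurrency(measured_rps, latency_s=0.045):
    """Little's Law: how many requests are actually in flight at once."""
    return measured_rps * latency_s

# Hypothetical example: a measured ~2222 req/s at 45 ms latency would mean
# only about 100 requests are being served concurrently, regardless of how
# many client threads are waiting.
print(effective_concurrency(2222))  # ≈ 100
```

If that number plateaus near some round figure (100, number of CPU cores, number of frontend workers), it points directly at the component enforcing that limit.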
I found that increasing the worker threads in the Python script from 100 to 500 did not increase the throughput, while fewer than 100 worker threads lowered it.
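To locate where throughput stops scaling, it can help to benchmark request rate at several concurrency levels and look for the knee in the curve. A minimal sketch, with a stub standing in for the real HTTP call (the sleep duration is an arbitrary stand-in for request latency):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_geocode(_addr):
    # Stand-in for the real HTTP request; sleep emulates network latency.
    time.sleep(0.005)
    return (52.5, 13.4)

def throughput(n_threads, n_requests=200):
    """Requests per second achieved with n_threads concurrent workers."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(fake_geocode, range(n_requests)))
    return n_requests / (time.perf_counter() - start)

for n in (1, 10, 50):
    print(f"{n:3d} threads: {throughput(n):8.1f} req/s")
```

Run against the real endpoint, the thread count at which the curve flattens is the effective concurrency limit of the whole chain (client, reverse proxy, Nominatim frontend, Postgres), whichever component saturates first.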
I also found that the Postgres instance was limited to a maximum of 100 parallel connections. After increasing the limit to 500 and restarting the Postgres instance, seemingly nothing changed in the throughput.
How can I figure out where the bottleneck is?
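One place to start is on the database side: Postgres can report how many backend connections are actually doing work at any moment. If most connections sit `idle` while throughput stays flat, the limit is upstream of Postgres (for example, the number of frontend worker processes serving the API). A sketch using standard Postgres views:

```sql
-- Current connection limit (should show the raised value after the restart)
SHOW max_connections;

-- How many backends are actually active vs. idle right now?
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state;
```

Running the second query repeatedly while the load test is in flight shows whether the extra client threads ever translate into extra active backends.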