add max_retries for requests #101
Comments
I think I'm getting the same error here, still on 0.4.0 though. Here's my traceback:
Traceback (most recent call last):
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/urllib3/connection.py", line 203, in _new_conn
sock = connection.create_connection(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/urllib3/connectionpool.py", line 791, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/urllib3/connectionpool.py", line 492, in _make_request
raise new_e
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/urllib3/connectionpool.py", line 468, in _make_request
self._validate_conn(conn)
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1097, in _validate_conn
conn.connect()
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/urllib3/connection.py", line 611, in connect
self.sock = sock = self._new_conn()
^^^^^^^^^^^^^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/urllib3/connection.py", line 212, in _new_conn
raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x7ffa8455ab10>, 'Connection to waterwebservices.rijkswaterstaat.nl timed out. (connect timeout=None)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/urllib3/connectionpool.py", line 845, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/urllib3/util/retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='waterwebservices.rijkswaterstaat.nl', port=443): Max retries exceeded with url: /ONLINEWAARNEMINGENSERVICES_DBO/OphalenWaarnemingen (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7ffa8455ab10>, 'Connection to waterwebservices.rijkswaterstaat.nl timed out. (connect timeout=None)'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "~/rws/rwsload.py", line 454, in <module>
dsn=sentry_dsn,
^^^^^^
File "~/rws/rwsload.py", line 436, in main
insertion_status = ReportsInsertionService.process_report(session=session, reports_data=result)
^^^^^^^^^^^^^^^^^^^^^^
File "~/rws/rwsload.py", line 115, in fetch_data
except JSONDecodeError:
^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/ddlpy/ddlpy.py", line 357, in measurements
measurement = _measurements_slice(
^^^^^^^^^^^^^^^^^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/ddlpy/ddlpy.py", line 301, in _measurements_slice
resp = requests.post(endpoint["url"], json=request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/requests/api.py", line 115, in post
return request("post", url, data=data, json=json, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/virtualenvs/weatherdata/lib/python3.11/site-packages/requests/adapters.py", line 507, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='waterwebservices.rijkswaterstaat.nl', port=443): Max retries exceeded with url: /ONLINEWAARNEMINGENSERVICES_DBO/OphalenWaarnemingen (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7ffa8455ab10>, 'Connection to waterwebservices.rijkswaterstaat.nl timed out. (connect timeout=None)')) |
@Weidav that could be the case. Fixing this issue would prevent your process from being interrupted by a single timeout. Of course, there could also be an outage of the Rijkswaterstaat server, in which case the process will fail either way. However, this problem is difficult to fix and debug, since we have no way to simulate a single timeout on the server side. It is also a nice-to-have feature, not as essential as the recently implemented developments. If you run into this issue again, please include a minimal code example that reproduces it, if it can be reproduced at all. |
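One possible way to approximate a single timeout without touching the server is to fake it on the client side. The sketch below is only an illustration, not ddlpy code: it patches `requests.post` with `unittest.mock` so that the first call raises `ConnectTimeout` and the second returns a stand-in response. The helper `post_with_retry`, the URL, and the fake response are all hypothetical names introduced for this example.

```python
from unittest import mock

import requests


def post_with_retry(url, payload, attempts=3):
    """Hypothetical helper: retry requests.post when the connection times out."""
    last_error = None
    for _ in range(attempts):
        try:
            return requests.post(url, json=payload, timeout=10)
        except requests.exceptions.ConnectTimeout as err:
            last_error = err
    raise last_error


# Stand-in for a successful response; no real request is sent.
fake_response = mock.Mock(status_code=200)

# First call raises a simulated timeout, second call returns the fake response.
side_effects = [requests.exceptions.ConnectTimeout("simulated timeout"), fake_response]

with mock.patch("requests.post", side_effect=side_effects) as patched_post:
    result = post_with_retry("https://example.invalid/endpoint", {})
    call_count = patched_post.call_count
```

Because the patch replaces `requests.post` itself, this exercises retry logic written around the call, not retries configured on a transport adapter; testing the latter would need a local test server.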
This keeps happening on a regular basis. I use the [...] Here's a little snippet from my code: EDIT: updated the csv again and I'm hoping that leads to fewer exceptions with measurements, I'll keep you updated.
from datetime import datetime, timedelta
import pandas
import ddlpy

selected_stations = pandas.read_csv("selected_stations.csv", index_col=0)
# measurements-timezone is always in utc+1
one_h_ago = datetime.utcnow() - timedelta(hours=2.1)
tomorrow = datetime.utcnow() + timedelta(days=1, hours=1)
# iterate over my known spots
for rws_id, spot_id in spots_dict.items():
    try:
        station = selected_stations.loc[rws_id]
    except KeyError:
        logger.info(f"spot-id: {spot_id} source_station-id: {rws_id} has no measurements")
        continue
    # when a station has only one entry, it is usually incomplete and stored as a series
    if isinstance(station, pandas.Series):
        logger.debug(f"{spot_id} measurements are incomplete and will be ignored")
        # a Series has no iterrows(), so skip this station
        continue
    i = 0
    # iterate over the different measurement-types (wind, waves...) from this station
    for index, station_data in station.iterrows():
        try:
            measurements = ddlpy.measurements(
                station_data, start_date=one_h_ago, end_date=tomorrow
            )
        except JSONDecodeError:
            continue
        [...] |
Update: I keep running into the same issues, even with an up-to-date locations / csv file. |
Could you provide example code to reproduce the issue without any of your own files or local code? That is, a minimal example that requires only ddlpy and its dependencies. |
Description
Sometimes, in the middle of data retrieval, the connection is aborted from the server side. I have not been able to reproduce this error (and I forgot to copy the traceback), but it is very inconvenient since it interrupts the download process.
Suggestion
Add a max_retries parameter for requests to improve the robustness of ddlpy.
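As a rough sketch of what this suggestion could look like, `requests` accepts a `max_retries` argument on `HTTPAdapter`, which can be an integer or a urllib3 `Retry` object. The function name `make_retrying_session` and the default values below are assumptions for illustration, not part of ddlpy. Including `POST` in `allowed_methods` (available in urllib3 1.26 and later) is also an assumption: it is only safe if the request is a read-only query, and urllib3 does not retry `POST` on read errors by default.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(max_retries=3, backoff_factor=0.5):
    """Hypothetical helper: build a Session that retries failed connections."""
    retry = Retry(
        total=max_retries,
        connect=max_retries,
        read=max_retries,
        backoff_factor=backoff_factor,  # sleep between attempts grows exponentially
        # Assumption: the POST requests are read-only queries, so retrying is safe.
        allowed_methods=frozenset({"GET", "POST"}),
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = make_retrying_session()
# Usage (not executed here): pass an explicit (connect, read) timeout as well,
# since the traceback above shows "connect timeout=None".
# resp = session.post(url, json=request_body, timeout=(5, 60))
```

A session built this way would replace the module-level `requests.post` call; retries then happen inside urllib3 before any exception reaches the caller.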