While using the crawler I encountered an issue with "max_num".
In cases where fewer images are found than the provided "max_num", an infinite loop begins, and the results downloaded up to that point are never reported.
The expected behavior is an immediate stop once there is nothing left to download, followed by a clean exit.
The following example uses the greedy crawler with a URL pointing at a Flickr search page. Image #27 was downloaded, and then the infinite loop began:
INFO - downloader - image #27	https://combo.staticflickr.com/pw/images/favicons/f>
INFO - parser - parser-001 is waiting for new page urls
INFO - parser - parser-002 is waiting for new page urls
INFO - parser - parser-004 is waiting for new page urls
INFO - parser - parser-003 is waiting for new page urls
INFO - parser - parser-001 is waiting for new page urls
INFO - parser - parser-002 is waiting for new page urls
INFO - parser - parser-004 is waiting for new page urls
INFO - parser - parser-003 is waiting for new page urls
INFO - parser - parser-001 is waiting for new page urls
INFO - downloader - downloader-001 is waiting for new download tasks
INFO - parser - parser-002 is waiting for new page urls
INFO - parser - parser-003 is waiting for new page urls
INFO - parser - parser-004 is waiting for new page urls
INFO - parser - parser-001 is waiting for new page urls
INFO - parser - parser-002 is waiting for new page urls
INFO - parser - parser-003 is waiting for new page urls
INFO - parser - parser-004 is waiting for new page urls
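For reference, here is a minimal reproduction sketch. It assumes the library is icrawler and the crawler is its GreedyImageCrawler (the log format above matches it); the search URL, storage directory, and max_num value are placeholders, not the exact ones I used:

```python
# Reproduction sketch (assumptions: icrawler's GreedyImageCrawler,
# placeholder URL and directory).
from icrawler.builtin import GreedyImageCrawler

crawler = GreedyImageCrawler(storage={'root_dir': 'flickr_images'})

# Ask for more images than the page actually yields. The crawler
# downloads everything it can find (27 images in my run), then the
# parser and downloader threads keep waiting for new work forever
# instead of shutting down.
crawler.crawl(domains='https://www.flickr.com/search/?text=example',
              max_num=100)
```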
Any updates?
If not, could you introduce a timeout mechanism in the meantime? Say 30 s or so. That way the crawler would still be functional. At the moment it is unstable to the point where I cannot use it :(
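Until a proper fix lands, a stopgap like the following might work on the caller's side. It is only a sketch, assuming it is acceptable to kill the whole crawl process after a deadline; the crawler class, URL, and the 30 s limit are illustrative, and icrawler itself is not assumed to expose any timeout option:

```python
# Workaround sketch: run the crawl in a child process and terminate it
# if it has not finished within a deadline. Already-downloaded files
# remain on disk even when the process is killed.
import multiprocessing


def run_crawl():
    from icrawler.builtin import GreedyImageCrawler  # assumed library
    crawler = GreedyImageCrawler(storage={'root_dir': 'flickr_images'})
    crawler.crawl(domains='https://www.flickr.com/search/?text=example',
                  max_num=100)


if __name__ == '__main__':
    p = multiprocessing.Process(target=run_crawl)
    p.start()
    p.join(timeout=30)   # give the crawler 30 s to finish on its own
    if p.is_alive():     # still stuck in the loop -> force it to stop
        p.terminate()
        p.join()
```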