Implementing fetcher.server.min.delay customization per domain/host/ip #889
jcruzmartini started this conversation in Ideas
Replies: 1 comment 1 reply
-
That could be a valuable contribution, thanks @jcruzmartini
-
Hi @jnioche, we have been dealing with some issues with a specific domain that has tons of URLs. Given that fetcher.server.min.delay is the minimum time we want to wait before hitting a domain again, I am wondering whether it would be a good idea to make this parameter configurable per domain/host/IP, the same way you already do for the maximum number of fetch threads with fetcher.maxThreads.host/domain/ip.
I am thinking of something like:
fetcher.server.min.delay.host/domain/ip
fetcher.server.delay.host/domain/ip
This would allow us to configure the crawler to hit a specific domain more frequently than the other domains/hosts/IPs.
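A minimal sketch of how such a lookup could work, not taken from the project's code. It assumes the custom value is stored under the global key plus a "." and the queue id (a host, domain or IP), and that delays are configured in seconds. The class PerQueueDelay and the method minDelayMs are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: resolves a per-queue minimum delay with a global fallback. */
class PerQueueDelay {
    // Global key discussed in this thread; the ".<queueId>" suffix layout is an assumption.
    static final String MIN_DELAY_KEY = "fetcher.server.min.delay";

    /**
     * Returns the minimum delay in milliseconds for the given queue id
     * (a host, domain or IP). Falls back to the global setting, then to defaultMs.
     */
    static long minDelayMs(Map<String, Object> conf, String queueId, long defaultMs) {
        // 1. Look for a value specific to this host/domain/IP.
        Object custom = conf.get(MIN_DELAY_KEY + "." + queueId);
        if (custom instanceof Number) {
            return toMs((Number) custom);
        }
        // 2. Otherwise use the global value shared by all queues.
        Object global = conf.get(MIN_DELAY_KEY);
        if (global instanceof Number) {
            return toMs((Number) global);
        }
        // 3. Nothing configured at all.
        return defaultMs;
    }

    // Assumption: values are given in (possibly fractional) seconds.
    static long toMs(Number seconds) {
        return Math.round(seconds.doubleValue() * 1000);
    }
}
```

With a global value of 1.0 and a custom value of 0.25 for example.com, minDelayMs would return 250 for example.com and 1000 for every other queue id.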
I can create a pull request with this proposal if you think the feature would be useful for the project.
Thanks
Juan