Commit

fixed: HcfMiddleware generates fewer links than requested. Solved s…
starrify committed Mar 26, 2015
1 parent 27663d4 commit c00f468
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion scrapylib/hcf.py
@@ -201,7 +201,8 @@ def _get_new_requests(self):
         """ Get a new batch of links from the HCF."""
         num_batches = 0
         num_links = 0
-        for num_batches, batch in enumerate(self.fclient.read(self.hs_frontier, self.hs_consume_from_slot), 1):
+        for batch in self.fclient.read(self.hs_frontier, self.hs_consume_from_slot, mincount=self.hs_max_links):
+            num_batches += 1
             for fingerprint, data in batch['requests']:
                 num_links += 1
                 yield Request(url=fingerprint, meta={'hcf_params': {'qdata': data}})
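
The effect of passing mincount can be illustrated with a small, self-contained sketch. FakeFrontierClient and get_new_requests are hypothetical stand-ins written for this example, not part of scrapylib or the frontier client. The read() semantics here are an assumption made for illustration only: without mincount a single batch is served, and with mincount batches are served until at least that many requests have been returned.

```python
class FakeFrontierClient:
    """Hypothetical stand-in for the frontier client (not the real API)."""

    def __init__(self, batches):
        self._batches = batches

    def read(self, frontier, slot, mincount=None):
        # ASSUMED semantics, for illustration only:
        #   mincount=None -> serve only the first batch
        #   mincount=N    -> serve batches until at least N requests were sent
        served = 0
        for index, batch in enumerate(self._batches):
            if mincount is None and index >= 1:
                return
            if mincount is not None and served >= mincount:
                return
            served += len(batch['requests'])
            yield batch


def get_new_requests(client, frontier, slot, max_links):
    """Mirror the loop in the patched method, yielding plain tuples."""
    num_batches = 0
    num_links = 0
    for batch in client.read(frontier, slot, mincount=max_links):
        num_batches += 1
        for fingerprint, data in batch['requests']:
            num_links += 1
            yield fingerprint, data


# Five batches of two links each (made-up URLs).
batches = [
    {'requests': [('http://example.com/%d' % n, None)
                  for n in (2 * b, 2 * b + 1)]}
    for b in range(5)
]
client = FakeFrontierClient(batches)

links_without = [r for b in client.read('frontier', '0') for r in b['requests']]
links_with = list(get_new_requests(client, 'frontier', '0', max_links=5))
print(len(links_without), len(links_with))
```

Under these assumptions, the read without mincount returns a single batch, while the read with mincount keeps returning whole batches until the requested number of links is covered, so the count can overshoot slightly.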