Improve cache key generation based on image URLs
The protocol is irrelevant; for the size, we can tolerate small
differences, and this way we can prefetch the next image for on-wiki
requests.

Bug: T286356
Daimona committed Jul 15, 2021
1 parent 8fc5eb5 commit 9766d7f
Showing 1 changed file with 18 additions and 1 deletion.
19 changes: 18 additions & 1 deletion src/Controller/OcrController.php
@@ -225,7 +225,7 @@ private function getText(): string
         $cacheKey = md5(implode(
             '|',
             [
-                $this->imageUrl,
+                self::transformImageURLForCacheKey($this->imageUrl),
                 static::$params['engine'],
                 implode('|', static::$params['langs']),
                 static::$params['psm'],
@@ -237,4 +237,21 @@ private function getText(): string
             return $this->engine->getText($this->imageUrl, static::$params['langs']);
         });
     }
+
+    /**
+     * Make an image URL suitable to be used as a cache key (e.g. strip protocol)
+     * @param string $url
+     * @return string
+     */
+    private static function transformImageURLForCacheKey(string $url): string
+    {
+        return preg_replace_callback(
+            '/(page\d+-)(\d+)px/',
+            static function (array $matches) {
+                // Tolerate ±50px, see T286356.
+                return $matches[1].( round($matches[2] / 100) * 100 ).'px';
+            },
+            preg_replace('/^https?:/i', '', $url)
+        );
+    }
 }
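
To make the effect of the new helper concrete, here is a minimal standalone sketch of the same normalization outside the controller. The function name `normalizeImageUrlForCacheKey` and the `example.org` URLs are assumptions made for illustration; they do not appear in the commit. The sketch also casts the rounded width to an integer, which the committed code does not do.

```php
<?php
// Illustrative copy of the normalization logic from the diff above.
// The function name and the example URLs are hypothetical.
function normalizeImageUrlForCacheKey(string $url): string
{
    // Drop the scheme so that http: and https: variants share one key.
    $withoutScheme = preg_replace('/^https?:/i', '', $url);

    // Round the thumbnail width to the nearest multiple of 100.
    return preg_replace_callback(
        '/(page\d+-)(\d+)px/',
        static function (array $matches): string {
            return $matches[1] . ((int) round($matches[2] / 100) * 100) . 'px';
        },
        $withoutScheme
    );
}

$a = normalizeImageUrlForCacheKey('https://example.org/thumb/Book.djvu/page5-1024px-Book.djvu.jpg');
$b = normalizeImageUrlForCacheKey('http://example.org/thumb/Book.djvu/page5-1000px-Book.djvu.jpg');
$c = normalizeImageUrlForCacheKey('https://example.org/thumb/Book.djvu/page5-1051px-Book.djvu.jpg');

var_dump($a === $b); // bool(true): same page, widths within the tolerance
var_dump($a === $c); // bool(false): 1051px rounds up to 1100px
```

Under these assumptions, requests for the same page that differ only in protocol or by a few pixels of width produce the same string, and therefore the same MD5 cache key, while a width that rounds to a different multiple of 100 still gets its own entry.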
