-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generalize function caching code #998
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1272,6 +1272,38 @@ function factor_strings($strings) | |
return [$left_common, $middles, $right_common]; | ||
} | ||
|
||
//-------------------------------------------------------------------------- | ||
|
||
/** | ||
* Given a function and arguments that will generate data, attempt | ||
* to load the data from memcached first and fall back to the function and | ||
* cache the response. If memcache is not running, this will call the requested | ||
* function and return its data without generating a notice/warning/error. | ||
* | ||
* If the requested function returns data that has localized strings, it's | ||
* important that $key_salt = get_desired_language(); | ||
*/ | ||
function memoize_function($function, $args = [], $expire_from_now = 3600, $key_salt = "") | ||
{ | ||
$memcache = new Memcached(); | ||
$memcache->addServer('localhost', 11211); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what if there is no memcache server? Is this setup in configuration somewhere? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If there is no local memcache server running, or it's not connected, all of the memcache calls return
So having memcache is effectively optional and if it isn't there or isn't working there won't be any caching. The API uses memcache for rate limiting and this is documented in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've updated both There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perfect, thank you! |
||
|
||
$key = hash("sha256", $function . serialize($args) . $key_salt); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. sha256 is resistant to collision attacks, this is a property we probably don't need for caching. Should we instead use something faster? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I changed it from md5 to sha256 to reduce the collision space since this has no control over the function name, args, or salt. I don't know what the odds are of a collision with md5 though, so it might not matter and md5 might be good enough. |
||
|
||
// if the key exists, just return the data uncompressed | ||
$data = $memcache->get($key); | ||
if ($data !== false) { | ||
return unserialize(gzuncompress($data)); | ||
} | ||
|
||
// if not, call the function to return the data | ||
$data = call_user_func_array($function, $args); | ||
|
||
$memcache->set($key, gzcompress(serialize($data)), time() + $expire_from_now); | ||
|
||
return $data; | ||
} | ||
|
||
// ----------------------------------------------------------------------------- | ||
|
||
/** | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we unconditionally add? Per https://www.php.net/manual/en/memcached.addserver.php it seems not advised to add the same server multiple times to the same server pool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function creates the memcached object at every call so we need to add it for every call. The object itself doesn't persist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yes, my bad. I guess the next question is should we not destroy it every call and instead keep the pool of servers alive? Does it have a strong impact to performance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Connection pooling is difficult with PHP as each page load is a distinct "execution" in most cases (this is not entirely accurate depending on how PHP is wired into the web server (eg Apache) and what multi-processing model is used (ie MPM)). It's this complexity that has us not pooling or using persistent DB connections either.
The per-instances loopback connections (for both memcache and the database) should be pretty lighweight and at least in this case we're just trading the MySQL connection + query complexity for a memcached connection + constant-time hash lookup. I will see if we can cache the connection within a page load via a static variable though, for the case where we have multiple calls to this function per page load -- let me test that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've cached the connection in a static function variable so we will at least re-use it in a single page load if there are multiple calls to the function.