Fixed a bug in the result folder cleanup at the worker start up time
In the original implementation of the file-based result delivery protocol,
workers attempted to clean up unclaimed files left in their result
folders regardless of the protocol option. This was a mistake for the
SSI protocol option, where the folder isn't required to exist. As a
result, the application posted confusing warnings to the logging
stream. This was fixed.
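
The guard described above can be sketched as a small predicate. This is only an illustration: the enum and the function name `shouldCleanUpResults` are hypothetical stand-ins, not the actual `wconfig::WorkerConfig` API.

```cpp
// Hypothetical stand-in for wconfig::WorkerConfig::ResultDeliveryProtocol.
enum class ResultDeliveryProtocol { SSI, XROOT, HTTP };

// Hypothetical helper: returns true only when the startup cleanup should
// touch the results folder. The SSI option is skipped because it does not
// require the folder to exist.
bool shouldCleanUpResults(bool cleanUpOnStart, ResultDeliveryProtocol protocol) {
    return cleanUpOnStart && protocol != ResultDeliveryProtocol::SSI;
}
```

With this predicate, a worker configured for SSI never opens the folder at startup, so no warning about a missing folder can be logged.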

A second fix concerns the HTTP and XROOT protocol options, where the
application did not abort right away if the required folder was missing
during the cleanup attempt. A missing folder in this scenario indicates
a configuration error or a problem with the infrastructure.
iagaponenko committed Sep 14, 2023
1 parent ba520b1 commit 386d0b0
Showing 2 changed files with 12 additions and 4 deletions.
12 changes: 9 additions & 3 deletions src/wbase/FileChannelShared.cc
@@ -54,10 +54,15 @@ LOG_LOGGER _log = LOG_GET("lsst.qserv.wbase.FileChannelShared");
 /**
  * Iterate over the result files at the results folder and remove those
  * which satisfy the desired criteria.
+ * @note The folder must exist when this function gets called. Any other
+ * scenario means a configuration error or a problem with the infrastructure.
+ * Running into either of these problems should result in the abort of
+ * the application.
  * @param context The calling context (used for logging purposes).
  * @param fileCanBeRemoved The optional validator to be called for each candidate file.
  * Note that missing validator means "yes" the candidate file can be removed.
  * @return The total number of removed files.
+ * @throws std::runtime_error If the results folder doesn't exist.
  */
 size_t cleanUpResultsImpl(string const& context, fs::path const& dirPath,
                           function<bool(string const&)> fileCanBeRemoved = nullptr) {
@@ -66,9 +71,10 @@ size_t cleanUpResultsImpl(string const& context, fs::path const& dirPath,
     boost::system::error_code ec;
     auto itr = fs::directory_iterator(dirPath, ec);
     if (ec.value() != 0) {
-        LOGS(_log, LOG_LVL_WARN,
-             context << "failed to open the results folder " << dirPath << ", ec: " << ec << ".");
-        return numFilesRemoved;
+        string const err = context + "failed to open the results folder '" + dirPath.string() +
+                           "', ec: " + to_string(ec.value()) + ".";
+        LOGS(_log, LOG_LVL_ERROR, err);
+        throw runtime_error(err);
     }
     for (auto&& entry : boost::make_iterator_range(itr, {})) {
         auto filePath = entry.path();
4 changes: 3 additions & 1 deletion src/xrdsvc/SsiService.cc
@@ -188,7 +188,9 @@ SsiService::SsiService(XrdSsiLogger* log)
     // ATTENTION: this is the blocking operation since it needs to be run before accepting
     // new queries to ensure that worker had sufficient resources to process those.
     if (workerConfig->resultsCleanUpOnStart()) {
-        wbase::FileChannelShared::cleanUpResultsOnWorkerRestart();
+        if (workerConfig->resultDeliveryProtocol() != wconfig::WorkerConfig::ResultDeliveryProtocol::SSI) {
+            wbase::FileChannelShared::cleanUpResultsOnWorkerRestart();
+        }
     }
 }
