Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[native] Add LinuxMemoryChecker check/warning to ensure system-mem-limit-gb is reasonably set #24149

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 9 additions & 4 deletions presto-docs/src/main/sphinx/presto_cpp/properties.rst
Original file line number Diff line number Diff line change
Expand Up @@ -142,13 +142,15 @@ The configuration properties of Presto C++ workers are described here, in alphab
^^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Default value:** ``40``
* **Default value:** ``57``

Memory allocation limit enforced by an internal memory allocator. It consists of two parts:
1) Memory used by the queries as specified in ``query-memory-gb``; 2) Memory used by the
system, such as disk spilling and cache prefetch.

Set ``system-memory-gb`` to the available machine memory of the deployment.
Set ``system-memory-gb`` to about 90% of available machine memory of the deployment.
This allows some buffer room to handle unaccounted memory in order to prevent out-of-memory conditions.
Default value of 57 gb is calculated based on available machine memory of 64 gb.


``query-memory-gb``
Expand Down Expand Up @@ -317,11 +319,14 @@ server is under low memory pressure.
^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Default value:** ``55``
* **Default value:** ``60``

Specifies the system memory limit that triggers the memory pushback or heap dump if
the server memory usage is beyond this limit. A value of zero means no limit is set.
This only applies if ``system-mem-pushback-enabled`` is ``true``.
This only applies if ``system-mem-pushback-enabled`` is ``true``.
Set ``system-mem-limit-gb`` to be greater than or equal to system-memory-gb but not
higher than the available machine memory of the deployment.
Default value of 60 gb is calculated based on available machine memory of 64 gb.

``system-mem-shrink-gb``
^^^^^^^^^^^^^^^^^^^^^^^^
Expand Down
141 changes: 126 additions & 15 deletions presto-native-execution/presto_cpp/main/LinuxMemoryChecker.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,12 @@
#include <sys/stat.h>
#include "presto_cpp/main/PeriodicMemoryChecker.h"
#include "presto_cpp/main/common/Configs.h"
#include "presto_cpp/main/common/Utils.h"

namespace facebook::presto {

using int128_t = __int128_t;

class LinuxMemoryChecker : public PeriodicMemoryChecker {
public:
explicit LinuxMemoryChecker(const PeriodicMemoryChecker::Config& config)
Expand All @@ -29,13 +32,32 @@ class LinuxMemoryChecker : public PeriodicMemoryChecker {
// it's mounted.
struct stat buffer;
if ((stat(kCgroupV1Path, &buffer) == 0)) {
statFile_ = kCgroupV1Path;
} else if ((stat(kCgroupV2Path, &buffer) == 0)) {
statFile_ = kCgroupV2Path;
} else {
statFile_ = "None";
PRESTO_STARTUP_LOG(INFO) << "Using cgroup v1.";
if (stat(kCgroupV1MemStatFile, &buffer) == 0) {
memStatFile_ = kCgroupV1MemStatFile;
}
if ((stat(kCgroupV1MaxMemFile, &buffer) == 0)) {
memMaxFile_ = kCgroupV1MaxMemFile;
}
}
LOG(INFO) << fmt::format("Using memory stat file {}", statFile_);

// In cgroup v2.
else {
PRESTO_STARTUP_LOG(INFO) << "Using cgroup v2.";
if (stat(kCgroupV2MemStatFile, &buffer) == 0) {
memStatFile_ = kCgroupV2MemStatFile;
}
if ((stat(kCgroupV2MaxMemFile, &buffer) == 0)) {
memMaxFile_ = kCgroupV2MaxMemFile;
}
}

PRESTO_STARTUP_LOG(INFO) << fmt::format(
"Using memory stat file: {}",
memStatFile_.empty() ? memInfoFile_ : memStatFile_);
PRESTO_STARTUP_LOG(INFO) << fmt::format(
"Using memory max file {}",
memMaxFile_.empty() ? memInfoFile_ : memMaxFile_);
}

~LinuxMemoryChecker() override {}
Expand All @@ -45,8 +67,91 @@ class LinuxMemoryChecker : public PeriodicMemoryChecker {
}

void setStatFile(std::string statFile) {
statFile_ = statFile;
LOG(INFO) << fmt::format("Changed to using memory stat file {}", statFile_);
memStatFile_ = statFile;
LOG(INFO) << fmt::format(
"Changed to using memory stat file {}", memStatFile_);
}

// This function is used for testing.
void setMemMaxFile(const std::string& memMaxFile) {
majetideepak marked this conversation as resolved.
Show resolved Hide resolved
memMaxFile_ = memMaxFile;
}

// This function is used for testing.
void setMemInfoFile(const std::string& memInfoFile) {
majetideepak marked this conversation as resolved.
Show resolved Hide resolved
memInfoFile_ = memInfoFile;
}

void start() override {
// Check system-memory-gb < system-mem-limit-gb < available machine memory
// of deployment.
auto* systemConfig = SystemConfig::instance();
int64_t systemMemoryInBytes = systemConfig->systemMemoryGb() << 30;
PRESTO_STARTUP_LOG(INFO)
<< fmt::format("System memory in bytes: {}", systemMemoryInBytes);
majetideepak marked this conversation as resolved.
Show resolved Hide resolved

PRESTO_STARTUP_LOG(INFO) << fmt::format(
"System memory limit in bytes: {}", config_.systemMemLimitBytes);

auto availableMemoryOfDeployment = getAvailableMemoryOfDeployment();
PRESTO_STARTUP_LOG(INFO) << fmt::format(
"Available machine memory of deployment in bytes: {}",
availableMemoryOfDeployment);

VELOX_CHECK_LE(
majetideepak marked this conversation as resolved.
Show resolved Hide resolved
config_.systemMemLimitBytes,
availableMemoryOfDeployment,
"system memory limit = {} bytes is higher than the available machine memory of deployment = {} bytes.",
config_.systemMemLimitBytes,
availableMemoryOfDeployment);

if (config_.systemMemLimitBytes < systemMemoryInBytes) {
LOG(WARNING) << "system-mem-limit-gb is smaller than system-memory-gb. "
<< "Expected: system-mem-limit-gb >= system-memory-gb.";
}

PeriodicMemoryChecker::start();
}

int128_t getAvailableMemoryOfDeployment() {
// Set the available machine memory of deployment to be the smaller number
// between /proc/meminfo and memMaxFile_.
int128_t availableMemoryOfDeployment = 0;
// meminfo's units is in kB.
folly::gen::byLine(memInfoFile_.c_str()) |
[&](const folly::StringPiece& line) -> void {
if (availableMemoryOfDeployment != 0) {
return;
}
availableMemoryOfDeployment = static_cast<int128_t>(
extractNumericConfigValueWithRegex(line, kMemTotalRegex) * 1024);
};

// For cgroup v1, memory.limit_in_bytes can default to a really big numeric
// value in bytes like 9223372036854771712 to represent that
majetideepak marked this conversation as resolved.
Show resolved Hide resolved
// memory.limit_in_bytes is not set to a value. The default value here is
majetideepak marked this conversation as resolved.
Show resolved Hide resolved
// set to PAGE_COUNTER_MAX, which is LONG_MAX/PAGE_SIZE on the 64-bit
// platform. The default value can vary based upon the platform's PAGE_SIZE.
// If memory.limit_in_bytes contains a really big numeric value, then we
// will use MemTotal from /proc/meminfo.

// For cgroup v2, memory.max can contain a numeric value in bytes or string
// "max" which represents no value has been set. If memory.max contains
// "max", then we will use MemTotal from /proc/meminfo.
if (!memMaxFile_.empty()) {
folly::gen::byLine(memMaxFile_.c_str()) |
[&](const folly::StringPiece& line) -> void {
if (line == "max") {
return;
}
availableMemoryOfDeployment =
std::min(availableMemoryOfDeployment, folly::to<int128_t>(line));
return;
};
}

// Unit is in bytes.
return availableMemoryOfDeployment;
}

protected:
Expand Down Expand Up @@ -80,8 +185,8 @@ class LinuxMemoryChecker : public PeriodicMemoryChecker {
size_t inactiveAnon = 0;
size_t activeAnon = 0;

if (statFile_ != "None") {
folly::gen::byLine(statFile_.c_str()) |
if (!memStatFile_.empty()) {
folly::gen::byLine(memStatFile_.c_str()) |
[&](const folly::StringPiece& line) -> void {
if (inactiveAnon == 0) {
inactiveAnon =
Expand All @@ -103,7 +208,7 @@ class LinuxMemoryChecker : public PeriodicMemoryChecker {
}

// Last resort use host machine info.
folly::gen::byLine("/proc/meminfo") |
folly::gen::byLine(memInfoFile_.c_str()) |
[&](const folly::StringPiece& line) -> void {
if (memAvailable == 0) {
memAvailable =
Expand Down Expand Up @@ -143,10 +248,16 @@ class LinuxMemoryChecker : public PeriodicMemoryChecker {
const boost::regex kInactiveAnonRegex{R"!(inactive_anon\s*(\d+)\s*)!"};
const boost::regex kActiveAnonRegex{R"!(active_anon\s*(\d+)\s*)!"};
const boost::regex kMemAvailableRegex{R"!(MemAvailable:\s*(\d+)\s*kB)!"};
const boost::regex kMemTotalRegex{R"!(MemTotal:\s*(\d+)\s*kB)!"};
const char* kCgroupV1Path = "/sys/fs/cgroup/memory/memory.stat";
const char* kCgroupV2Path = "/sys/fs/cgroup/memory.stat";
std::string statFile_;
const boost::regex kMemTotalRegex{R"!(MemTotal:\s*(\d+)\s+kB)!"};
const char* kCgroupV1Path = "/sys/fs/cgroup/memory";
const char* kCgroupV1MemStatFile = "/sys/fs/cgroup/memory/memory.stat";
const char* kCgroupV2MemStatFile = "/sys/fs/cgroup/memory.stat";
const char* kCgroupV1MaxMemFile =
"/sys/fs/cgroup/memory/memory.limit_in_bytes";
const char* kCgroupV2MaxMemFile = "/sys/fs/cgroup/memory.max";
std::string memInfoFile_ = "/proc/meminfo";
minhancao marked this conversation as resolved.
Show resolved Hide resolved
std::string memStatFile_;
std::string memMaxFile_;

size_t extractNumericConfigValueWithRegex(
const folly::StringPiece& line,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -71,10 +71,10 @@ class PeriodicMemoryChecker {

/// Starts the 'PeriodicMemoryChecker'. A background scheduler will be
/// launched to perform the checks. This should only be called once.
void start();
virtual void start();

/// Stops the 'PeriodicMemoryChecker'.
void stop();
virtual void stop();

protected:
/// Returns current system memory usage. The returned value is used to compare
Expand Down
4 changes: 2 additions & 2 deletions presto-native-execution/presto_cpp/main/common/Configs.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -168,9 +168,9 @@ SystemConfig::SystemConfig() {
STR_PROP(kSpillerDirectoryCreateConfig, ""),
NONE_PROP(kSpillerSpillPath),
NUM_PROP(kShutdownOnsetSec, 10),
NUM_PROP(kSystemMemoryGb, 40),
NUM_PROP(kSystemMemoryGb, 57),
BOOL_PROP(kSystemMemPushbackEnabled, false),
NUM_PROP(kSystemMemLimitGb, 55),
NUM_PROP(kSystemMemLimitGb, 60),
NUM_PROP(kSystemMemShrinkGb, 8),
BOOL_PROP(kMallocMemHeapDumpEnabled, false),
BOOL_PROP(kSystemMemPushbackAbortEnabled, false),
Expand Down
Loading
Loading