feat: support host monitor #1890
6150e52 to bfdd9c2
@@ -58,6 +58,8 @@ enum class EventGroupMetaKey {
    PROMETHEUS_SCRAPE_TIMESTAMP_MILLISEC,
    PROMETHEUS_UP_STATE,

    HOST_MONITOR_COLLECT_TIME,
Could this field be decoupled from any specific feature and serve as a generic field?
LOG_DEBUG(
    sLogger,
    ("send http request succeeded, item address", request->mItem)(
        "config-flusher-dst", QueueKeyManager::GetInstance()->GetName(request->mItem->mQueueKey))(
        "response time", ToString(responseTimeMs) + "ms")("try cnt", ToString(request->mTryCnt))(
        "sending cnt", ToString(FlusherRunner::GetInstance()->GetSendingBufferCount())));
static_cast<HttpFlusher*>(request->mItem->mFlusher)->OnSendDone(request->mResponse, request->mItem);
What is the reason for changing this file?
This debug log can cause a core dump; it is a leftover issue.
 * limitations under the License.
 */

#include "MockCollector.h"
del

namespace logtail {

CollectorManager::CollectorManager() {
There is not much need to delete this, is there?
const std::string ProcessorHostMetaNative::sName = "processor_host_meta_native";

bool ProcessorHostMetaNative::Init(const Json::Value& config) {
    auto hostType = ToString(getenv(DEFAULT_ENV_KEY_HOST_TYPE.c_str()));
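This mechanism is not reasonable either. There should be a global management center that each consumer simply reads values from. Please raise this at the daily standup for discussion.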

// for process entity
const std::string DEFAULT_CONTENT_VALUE_ENTITY_TYPE_PROCESS = "process";
const std::string DEFAULT_CONTENT_KEY_PROCESS_PID = "process_pid";
Keep these as consistent as possible with the log and metric types in TagConstants.cpp.
Also take a look at the node-exporter metric labels for reference.
node exporter metric names are also joined with underscores:
fixed prefix node_ + metric type + metric
e.g. node_memory_HugePages_Total
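
To make the naming pattern above concrete, here is a minimal sketch of composing such a name. The helper `BuildNodeStyleMetricName` and its parameters are hypothetical and exist only for illustration; they are not part of this PR or of node exporter.

```cpp
#include <string>

// Hypothetical helper (not from this PR): joins the fixed "node_" prefix,
// a metric type, and a metric name with underscores.
inline std::string BuildNodeStyleMetricName(const std::string& metricType, const std::string& metric) {
    return "node_" + metricType + "_" + metric;
}
```

Calling `BuildNodeStyleMetricName("memory", "HugePages_Total")` yields the name quoted in the comment, `node_memory_HugePages_Total`.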
namespace logtail {

int64_t GetSystemBootSeconds() {
    static int64_t systemBootSeconds;
Shouldn't this file be part of the collector?
This is a system-level metric that several collectors use. For example, both the cpu collector and the process collector need the boot time obtained here.
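
One way a shared, system-level helper like this could be structured is sketched below. It assumes the value comes from the `btime` line of `/proc/stat` on Linux; the PR's actual data source is not shown in the hunk. `ParseBootSeconds` and `CachedSystemBootSeconds` are hypothetical names, not the PR's functions.

```cpp
#include <cstdint>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scans /proc/stat-style text for the "btime" line and
// returns its value (seconds since the epoch). Returns 0 if no line matches.
inline int64_t ParseBootSeconds(std::istream& in) {
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key;
        int64_t value = 0;
        if ((fields >> key >> value) && key == "btime") {
            return value;
        }
    }
    return 0;
}

// Hypothetical shared accessor: the function-local static is initialized once
// (thread-safe since C++11), so every collector reuses the same value.
inline int64_t CachedSystemBootSeconds() {
    static const int64_t bootSeconds = [] {
        std::ifstream stat("/proc/stat"); // assumption: Linux procfs is available
        return ParseBootSeconds(stat);
    }();
    return bootSeconds;
}
```

Keeping the parser separate from the cached accessor lets it be unit tested against a string stream without touching the real procfs.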
targetEvent->SetContent("binary", sourceEvent.GetContent(DEFAULT_CONTENT_KEY_PROCESS_BINARY));
targetEvent->SetContent("arguments", sourceEvent.GetContent(DEFAULT_CONTENT_KEY_PROCESS_ARGUMENTS));
targetEvent->SetContent("language", sourceEvent.GetContent(DEFAULT_CONTENT_KEY_PROCESS_LANGUAGE));
targetEvent->SetContent("containerID", sourceEvent.GetContent(DEFAULT_CONTENT_KEY_PROCESS_CONTAINER_ID));
Is DEFAULT_CONTENT_KEY_PROCESS_CONTAINER_ID set anywhere? Also, it is not necessarily a container.

const size_t ProcessTopN = 20;

void ProcessCollector::Collect(PipelineEventGroup& group) {
Why is concurrency necessary here?
std::string mDomain;
std::string mEntityType;
std::string mHostEntityID;
Are these three members actually used?
bool Init(const Json::Value& config, Json::Value& optionalGoPipeline) override;
bool Start() override;
bool Stop(bool isPipelineRemoving) override;
bool SupportAck() const override { return false; }
Change this to true for now.

bool InputHostMeta::Stop(bool isPipelineRemoving) {
    LOG_INFO(sLogger, ("input host meta stop", mContext->GetConfigName()));
    HostMonitorInputRunner::GetInstance()->RemoveCollector(mContext->GetConfigName());
If isPipelineRemoving is false, do not call this.
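
A minimal sketch of the guard being requested, under stated assumptions: `FakeRunner` is a hypothetical stand-in for `HostMonitorInputRunner`, and `StopInput` stands in for `InputHostMeta::Stop`. Neither is the real class or signature from the codebase.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for HostMonitorInputRunner; it only tracks config names.
class FakeRunner {
public:
    void UpdateCollector(const std::string& configName) { mConfigs.insert(configName); }
    void RemoveCollector(const std::string& configName) { mConfigs.erase(configName); }
    bool Has(const std::string& configName) const { return mConfigs.count(configName) > 0; }

private:
    std::set<std::string> mConfigs;
};

// Sketch of the Stop logic: the collector is removed only when the pipeline
// itself is being removed. On a plain stop (for example, a config update)
// the registration is left in place.
inline bool StopInput(FakeRunner& runner, const std::string& configName, bool isPipelineRemoving) {
    if (isPipelineRemoving) {
        runner.RemoveCollector(configName);
    }
    return true;
}
```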
bool InputHostMeta::Start() {
    LOG_INFO(sLogger, ("input host meta start", mContext->GetConfigName()));
    HostMonitorInputRunner::GetInstance()->Init();
    HostMonitorInputRunner::GetInstance()->UpdateCollector(
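Inside the function, check whether this config has been registered before, and use that to decide whether an initial event needs to be added.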
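
The suggestion could look roughly like the following sketch. `SketchRunner` is a hypothetical stand-in for `HostMonitorInputRunner`, and the counter merely stands in for pushing a first timer event; the real `UpdateCollector` signature is truncated in the hunk and is not reproduced here.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for HostMonitorInputRunner; not the real class.
class SketchRunner {
public:
    // Registers or refreshes a config. An initial event is scheduled only the
    // first time a given config name is seen. Returns true in that case.
    bool UpdateCollector(const std::string& configName) {
        const bool isNewConfig = mKnownConfigs.insert(configName).second;
        if (isNewConfig) {
            ++mInitialEventCount; // stands in for pushing the first timer event
        }
        return isNewConfig;
    }

    std::size_t InitialEventCount() const { return mInitialEventCount; }

private:
    std::unordered_set<std::string> mKnownConfigs;
    std::size_t mInitialEventCount = 0;
};
```

With this shape, calling `UpdateCollector` again for an existing config does not queue a second periodic chain for the same collector.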
if (ProcessQueueManager::GetInstance()->IsValidToPush(processQueueKey)) {
    ProcessQueueManager::GetInstance()->PushQueue(processQueueKey, std::move(item));
} else {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    // try again
    if (ProcessQueueManager::GetInstance()->IsValidToPush(processQueueKey)) {
        ProcessQueueManager::GetInstance()->PushQueue(processQueueKey, std::move(item));
    } else {
        LOG_WARNING(sLogger, ("process queue is full", "discard data")("config", configName));
    }
}
Use ProcessorRunner::GetInstance()->PushQueue
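
The hand-rolled check, sleep, and re-check in the hunk above is a bounded retry. The sketch below shows that pattern factored into one helper, which is presumably the kind of logic the reviewer wants centralized. `PushWithRetry` is a hypothetical name, and this is not the signature of `ProcessorRunner::PushQueue`, which does not appear in this thread.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical helper: calls tryPush up to maxAttempts times, sleeping for
// `backoff` between failed attempts. Returns true as soon as one attempt succeeds.
inline bool PushWithRetry(const std::function<bool()>& tryPush,
                          uint32_t maxAttempts,
                          std::chrono::milliseconds backoff) {
    for (uint32_t attempt = 0; attempt < maxAttempts; ++attempt) {
        if (tryPush()) {
            return true;
        }
        if (attempt + 1 < maxAttempts) {
            std::this_thread::sleep_for(backoff); // no sleep after the final failure
        }
    }
    return false;
}
```

The caller would log and discard the data only when the helper returns false, matching the behavior of the original `else` branch.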
mThreadPool->Add([this, eventCopy]() mutable {
    auto configName = eventCopy.GetConfigName();
    auto collectorName = eventCopy.GetCollectorName();
    auto processQueueKey = eventCopy.GetProcessQueueKey();
    PipelineEventGroup group(std::make_shared<SourceBuffer>());
    auto collector = CollectorManager::GetInstance()->GetCollector(collectorName);
    collector->Collect(group);
This does not look right to me; there should not be a copy here.
LOG_DEBUG(sLogger, ("schedule host monitor collector again", configName)("collector", collectorName));

eventCopy.ResetForNextExec();
mTimer->PushEvent(std::make_unique<HostMonitorTimerEvent>(eventCopy));
Construct a new one directly here.
4110e3d to 3e12438
TODO: