Self-hosted Prometheus cannot retrieve aggregated metrics #7

Quintonwong · 2022-06-26T12:58:38Z

1. The crane-scheduler-controller logs show that none of the aggregated metrics can be retrieved:
W0626 20:55:02.198329 1 node.go:61] failed to sync this node ["k8s-node4/mem_usage_avg_5m"]: can not annotate node[k8s-node4]: failed to get data mem_usage_avg_5m{k8s-node4=}:
2.

autumn0207 · 2022-06-26T14:39:23Z

@Quintonwong

First, check whether the aggregated metrics can be queried from inside the container:

curl -g 'http://{REPLACE_ME_WITH_PROMETHEUS_ADDRESS}/api/v1/query?query=cpu_usage_avg_5m'

curl -g 'http://{REPLACE_ME_WITH_PROMETHEUS_ADDRESS}/api/v1/query?query=mem_usage_avg_5m'

Then, check the non-aggregated metrics:

curl -g 'http://{REPLACE_ME_WITH_PROMETHEUS_ADDRESS}/api/v1/query?query=up'

If the non-aggregated metrics data is OK but the aggregated metrics data cannot be pulled, the Prometheus rules have not taken effect. Please refer to https://prometheus.io/docs/prometheus/latest/configuration/configuration
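
To make the comparison between the two kinds of query concrete, here is a minimal Python sketch (not part of crane) that classifies the JSON body returned by /api/v1/query. The function name classify_query_response is hypothetical; the response shape assumed here is the one used by the Prometheus HTTP API (a "status" field, and "data.result" holding the matched series).

```python
import json


def classify_query_response(body: str) -> str:
    """Classify a /api/v1/query response body as 'error', 'empty', or 'ok'.

    Hypothetical helper for illustration only.
    """
    payload = json.loads(body)
    if payload.get("status") != "success":
        # For example a missing ?query= parameter yields status "error".
        return "error"
    result = payload.get("data", {}).get("result", [])
    # A query that matches no series succeeds but returns an empty result.
    return "ok" if result else "empty"


if __name__ == "__main__":
    bad = '{"status":"error","errorType":"bad_data","error":"no expression found in input"}'
    empty = '{"status":"success","data":{"resultType":"vector","result":[]}}'
    print(classify_query_response(bad))
    print(classify_query_response(empty))
```

If up is classified as "ok" while cpu_usage_avg_5m and mem_usage_avg_5m come back "empty", that matches the case described above, where the recording rules are probably not loaded.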

ArvinChen1991 · 2022-06-27T00:32:55Z

@Quintonwong

First, check whether the aggregated metrics can be queried from inside the container:
curl -g 'http://{REPLACE_ME_WITH_PROMETHEUS_ADDRESS}/api/v1/query?query=cpu_usage_avg_5m'
curl -g 'http://{REPLACE_ME_WITH_PROMETHEUS_ADDRESS}/api/v1/query?query=mem_usage_avg_5m'
Then, check the non-aggregated metrics:
curl -g 'http://{REPLACE_ME_WITH_PROMETHEUS_ADDRESS}/api/v1/query'
If the non-aggregated metrics data is OK but the aggregated metrics data cannot be pulled, the Prometheus rules have not taken effect. Please refer to https://prometheus.io/docs/prometheus/latest/configuration/configuration

The last command returns an error:
curl -g 'http://x.x.x.x:9090/api/v1/query'
{"status":"error","errorType":"bad_data","error":"invalid parameter 'query': parse error at char 1: no expression found in input"}

autumn0207 · 2022-06-29T04:42:28Z

curl -g 'http://x.x.x.x:9090/api/v1/query'

I made a mistake; the command should be:

curl -g 'http://{REPLACE_ME_WITH_PROMETHEUS_ADDRESS}/api/v1/query?query=up'

ArvinChen1991 · 2022-07-01T01:46:17Z

curl -g 'http://x.x.x.x:9090/api/v1/query'

I made a mistake; the command should be:
curl -g 'http://{REPLACE_ME_WITH_PROMETHEUS_ADDRESS}/api/v1/query?query=up'

This returns success.

xieydd · 2022-12-09T03:07:21Z

I think you can increase the interval (in seconds) used for cpu_usage_active.

sdnmw · 2023-03-19T07:29:33Z

I have the same problem. Kubernetes version: 1.23.10, crane version: v0.5.1, crane-scheduler-controller: v0.1.23.

I have checked both the aggregated and the non-aggregated metrics data, and both can be retrieved. I also changed the interval of cpu_usage_active to 5s, but the controller still cannot obtain the data or annotate the nodes.

W0319 15:26:24.293385 1 node.go:61] failed to sync this node ["kse2/cpu_usage_avg_5m"]: can not annotate node[kse2]: failed to get data cpu_usage_avg_5m{kse2=}:
I0319 15:26:24.295764 1 node.go:75] Finished syncing node event "kse3/cpu_usage_avg_5m" (2.357063ms)
W0319 15:26:24.295781 1 node.go:61] failed to sync this node ["kse3/cpu_usage_avg_5m"]: can not annotate node[kse3]: failed to get data cpu_usage_avg_5m{kse3=}:
I0319 15:26:24.298258 1 node.go:75] Finished syncing node event "kse4/cpu_usage_avg_5m" (2.454873ms)
W0319 15:26:24.298279 1 node.go:61] failed to sync this node ["kse4/cpu_usage_avg_5m"]: can not annotate node[kse4]: failed to get data cpu_usage_avg_5m{kse4=}:

Could you help me, @xieydd? Thanks very much.

nailianglu · 2023-03-30T05:19:39Z

@Quintonwong

First, check whether the aggregated metrics can be queried from inside the container:
curl -g 'http://{REPLACE_ME_WITH_PROMETHEUS_ADDRESS}/api/v1/query?query=cpu_usage_avg_5m'
curl -g 'http://{REPLACE_ME_WITH_PROMETHEUS_ADDRESS}/api/v1/query?query=mem_usage_avg_5m'
Then, check the non-aggregated metrics:
curl -g 'http://{REPLACE_ME_WITH_PROMETHEUS_ADDRESS}/api/v1/query?query=up'
If the non-aggregated metrics data is OK but the aggregated metrics data cannot be pulled, the Prometheus rules have not taken effect. Please refer to https://prometheus.io/docs/prometheus/latest/configuration/configuration

Hi, I ran into the same problem. From inside the crane-scheduler-controller container I can retrieve the aggregated data, but the crane-scheduler-controller container log keeps reporting errors:
I0330 13:18:01.658598 1 node.go:75] Finished syncing node event "cn-hangzhou.i-bp19r762s7xryoo6fjmx/mem_usage_avg_5m" (35.978µs)
W0330 13:18:01.658604 1 node.go:61] failed to sync this node ["cn-hangzhou.i-bp19r762s7xryoo6fjmx/mem_usage_avg_5m"]: can not annotate node[cn-hangzhou.i-bp19r762s7xryoo6fjmx]: failed to get data mem_usage_avg_5m{cn-hangzhou.i-bp19r762s7xryoo6fjmx=}: Post "10.7.1.60/api/v1/query": unsupported protocol scheme ""
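
The trailing `unsupported protocol scheme ""` is the message Go's HTTP client gives when a request URL has no `http://` or `https://` prefix. In this log the controller is posting to `10.7.1.60/api/v1/query`, so the Prometheus address configured for it was probably given as a bare host. The sketch below is a hypothetical helper, not crane's code, that only illustrates the check; where the address is actually configured depends on the deployment.

```python
def ensure_scheme(address: str, default_scheme: str = "http") -> str:
    """Return address with a URL scheme, adding default_scheme if it is missing.

    Hypothetical helper for illustration; crane itself does not expose this.
    """
    if address.lower().startswith(("http://", "https://")):
        return address
    return f"{default_scheme}://{address}"


if __name__ == "__main__":
    # The bare host from the log above gains an explicit scheme.
    print(ensure_scheme("10.7.1.60"))
    # An address that already has a scheme is left unchanged.
    print(ensure_scheme("http://10.7.1.60:9090"))
```

Manual curl tests from inside the container can still succeed in this situation, because the curl commands in this thread write out the `http://` prefix.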

wyaopeng · 2023-12-19T08:00:06Z

Try upgrading Prometheus and node-exporter to the latest versions.

niyang110 · 2024-04-25T07:25:40Z

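@sdnmw The reason no value is returned is that crane converts the node name to the node IP and then queries Prometheus using the node IP as the value of the instance label.

When this happens, node_exporter is probably deployed inside Kubernetes. You can add label relabeling to the Prometheus scrape job for node-exporter: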
- source_labels: [__meta_kubernetes_node_address_InternalIP]
  target_label: instance
  action: replace
- source_labels: [__meta_kubernetes_node_address_Hostname]
  target_label: instance_name
  action: replace
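
To see why the instance label matters, here is a minimal Python sketch of the lookup described above. It assumes the query result is filtered by comparing each series' instance label with the node IP, and it also accepts an `ip:port` form, which is an assumption rather than confirmed crane behavior. The function name series_for_node and the IP 192.0.2.12 are hypothetical; only IPv4 addresses are handled.

```python
def series_for_node(result: list, node_ip: str) -> list:
    """Return the series whose instance label is node_ip or node_ip:port.

    Hypothetical illustration of an instance-label lookup; IPv4 only.
    """
    matches = []
    for series in result:
        instance = series.get("metric", {}).get("instance", "")
        # Drop an optional ":port" suffix before comparing.
        host = instance.rsplit(":", 1)[0] if ":" in instance else instance
        if host == node_ip:
            matches.append(series)
    return matches


if __name__ == "__main__":
    # Before relabeling: instance carries the hostname, so the IP lookup finds nothing.
    before = [{"metric": {"instance": "kse2:9100"}, "value": [0, "12.5"]}]
    # After relabeling: instance carries the node's internal IP.
    after = [{"metric": {"instance": "192.0.2.12", "instance_name": "kse2"}, "value": [0, "12.5"]}]
    print(len(series_for_node(before, "192.0.2.12")))
    print(len(series_for_node(after, "192.0.2.12")))
```

With hostname-valued instance labels the lookup returns nothing even though the metric exists, which would be consistent with the "failed to get data" warnings reported earlier in the thread.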
