Merge pull request #3 from PanKaker/main
Update to 1.2.0.
PanKaker authored Dec 2, 2022
2 parents f77fb01 + 537a9bf commit c092e63
Showing 23 changed files with 102 additions and 73 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
.idea
6 changes: 6 additions & 0 deletions .markdownlint.yml
@@ -0,0 +1,6 @@
{
"MD013": false,
"MD033": {
"allowed_elements": ["br"]
}
}
71 changes: 49 additions & 22 deletions README.md
@@ -19,25 +19,37 @@ Pre-configuration is needed for a container to read metrics from specific plugin

### [Intel PowerStat plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/intel_powerstat)

Plugin is based on Linux Kernel modules that expose specific metrics over `sysfs` or `devfs` interfaces.
The following dependencies are expected by plugin:

- _intel-rapl_ module which exposes Intel Runtime Power Limiting metrics over `sysfs` (`/sys/devices/virtual/powercap/intel-rapl`),
- _msr_ kernel module that provides access to processor model specific registers over `devfs` (`/dev/cpu/cpu%d/msr`),
- _cpufreq_ kernel module - which exposes per-CPU Frequency over `sysfs` (`/sys/devices/system/cpu/cpu%d/cpufreq/scaling_cur_freq`).

Minimum kernel version required is 3.13 to satisfy all requirements.

Please make sure that kernel modules are loaded and running (cpufreq is integrated in kernel). Modules might have to be manually enabled by using `modprobe`.
Depending on the kernel version, run commands:
Plugin is based on Linux Kernel modules that expose specific metrics over
`sysfs` or `devfs` interfaces. The following dependencies are expected by
plugin:

- _intel-rapl_ module which exposes Intel Runtime Power Limiting metrics over
`sysfs` (`/sys/devices/virtual/powercap/intel-rapl`),
- _msr_ kernel module that provides access to processor model specific
registers over `devfs` (`/dev/cpu/cpu%d/msr`),
- _cpufreq_ kernel module - which exposes per-CPU Frequency over `sysfs`
(`/sys/devices/system/cpu/cpu%d/cpufreq/scaling_cur_freq`).
- _intel-uncore-frequency_ module exposes Intel uncore frequency metrics
over `sysfs` (`/sys/devices/system/cpu/intel_uncore_frequency`),

Minimum kernel version required is 3.13 to satisfy most of requirements,
for `uncore_frequency` metrics `intel-uncore-frequency` module is required
(available since kernel 5.6).

Please make sure that kernel modules are loaded and running (cpufreq is
integrated in kernel). Modules might have to be manually enabled by using
`modprobe`. Depending on the kernel version, run commands:

```sh
# kernel 5.x.x:
sudo modprobe rapl
subo modprobe msr
sudo modprobe msr
sudo modprobe intel_rapl_common
sudo modprobe intel_rapl_msr

# also for kernel >= 5.6.0
sudo modprobe intel-uncore-frequency

# kernel 4.x.x:
sudo modprobe msr
sudo modprobe intel_rapl
@@ -53,12 +65,16 @@ The Redfish plugin needs hardware servers for which
- The DPDK plugin needs external application built with
[Data Plane Development Kit](https://www.dpdk.org/).
- `./telegraf-intel-docker.sh` has default location of DPDK socket -`/var/run/dpdk/rte`, if DPDK socket is located
somewhere else, user must specify this in running stage providing `--dpdk_socket_path` flag.
somewhere else, user must specify this in running stage providing `--dpdk_socket_path` flag. Providing path to a
directory that contains the hosts' own Docker socket file is not recommended.

### [Intel PMU](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/intel_pmu)

The plugin requires JSON files with event definitions to work properly. Those can be specified in `./telegraf-intel-docker.sh`
by providing `--pmu_events` parameter. More information about event definitions and where to get them should be found in plugin's
by providing `--pmu_events` parameter. Providing path to a directory that contains the hosts' own Docker socket file
is not recommended.

More information about event definitions and where to get them should be found in plugin's
[README](https://github.com/influxdata/telegraf/blob/master/plugins/inputs/intel_pmu/README.md).

### [RAS](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/ras)
@@ -201,14 +217,15 @@ The following plugins should work on a majority of the host's configurations.
4. [Disk IO](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/diskio)
5. [DNS Query](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/dns_query)
6. [ETH Tool](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/ethtool)
7. [IP Tables](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/iptables)
8. [Kernel VMStat](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/kernel_vmstat)
9. [Mem](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/mem)
10. [Net](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/net)
11. [Ping](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/ping)
12. [Smart](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/smart)
13. [System](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/system)
14. [Temp](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/temp)
7. [Hugepages](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/hugepages)
8. [IP Tables](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/iptables)
9. [Kernel VMStat](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/kernel_vmstat)
10. [Mem](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/mem)
11. [Net](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/net)
12. [Ping](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/ping)
13. [Smart](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/smart)
14. [System](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/system)
15. [Temp](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/temp)

#### Disabled by default

@@ -230,3 +247,13 @@ List of supported Telegraf output plugins enabled by default.

1. [File](https://github.com/influxdata/telegraf/tree/master/plugins/outputs/file)
2. [Prometheus client](https://github.com/influxdata/telegraf/tree/master/plugins/outputs/prometheus_client)

### Changelog

#### 1.2.0

- Update telegraf version: 1.21.3 -> 1.24.3
- Update version of pqos (intel_cmt_cat): 4.2.0 -> 4.4.1
- Add Hugepages plugin (enabled by default)
- Add new features: uncore_freq and max_turbo_freq for Powerstat plugin
- Update the final alpine image: 3.15 -> 3.16
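
The kernel-version rule in the README hunk above can be sketched as a small shell helper. This is only an illustration: `powerstat_modules` is a hypothetical name that does not exist in the repository, the module lists are copied from the hunk as shown (the 4.x list may continue past the truncated context), and treating kernels newer than 5.x like 5.x is an assumption.

```sh
#!/bin/sh
# Hypothetical helper (not part of the repository): print the kernel modules
# that the README hunk lists for a given kernel release string.
powerstat_modules() {
    release=$1                  # e.g. "5.15.0-56-generic" from `uname -r`
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder

    # Reject anything that does not start with a numeric major version.
    case "$major" in
        '' | *[!0-9]*) return 1 ;;
    esac
    [ -n "$minor" ] || minor=0

    if [ "$major" -ge 5 ]; then
        # Assumption: kernels newer than 5.x use the same modules as 5.x.
        modules="rapl msr intel_rapl_common intel_rapl_msr"
        # intel-uncore-frequency is available since kernel 5.6.
        if [ "$major" -gt 5 ] || [ "$minor" -ge 6 ]; then
            modules="$modules intel-uncore-frequency"
        fi
    elif [ "$major" -eq 4 ]; then
        modules="msr intel_rapl"
    else
        return 1                # the README gives no commands for older kernels
    fi
    echo "$modules"
}

# Usage (needs root; not run here):
#   for m in $(powerstat_modules "$(uname -r)"); do sudo modprobe "$m"; done
```

Printing the list instead of calling `modprobe` directly keeps the version logic easy to check without touching the host.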
18 changes: 7 additions & 11 deletions images/telegraf/Dockerfile
@@ -22,15 +22,15 @@
#

ARG TELEGRAF_TAG
ARG FINAL_BASE_IMAGE
FROM telegraf:$TELEGRAF_TAG AS builder

RUN apk update && \
apk add --update --no-cache \
zlib=~1.2.12 \
build-base=~0.5 \
bash=~5.1 \
sudo=~1.9 \
git=~2.34 \
git=~2.36 \
alpine-sdk=~1.0 \
libffi-dev=~3.4 \
openssl-dev=~1.1 && \
@@ -43,27 +43,25 @@ COPY ./images/telegraf/script/install_intel_cmt_cat.sh ./
RUN ./install_intel_cmt_cat.sh

# Build final image where telegraf is running.
FROM alpine:3.15.0
FROM $FINAL_BASE_IMAGE

RUN export PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" && \
apk update && \
apk add --update --no-cache \
zlib=~1.2.12 \
util-linux=~2.37 \
util-linux=~2.38 \
bash=~5.1 \
sudo=~1.9 \
smartmontools=~7.2 \
nvme-cli=~1.16 \
smartmontools=~7.3 \
nvme-cli=~2.0 \
rasdaemon=~0.6 \
iptables=~1.8 \
ipmitool=~1.8 \
su-exec=~0.2 \
iputils=~20210722 && \
iputils=~20211215 && \
rm -rf /var/lib/apt/lists/*

WORKDIR "/telegraf_docker/"

COPY ./images/telegraf/script/sudo_configuration.sh .
COPY ./DEPENDENCIES_SOURCE_CODE_INFO.md .

# Copy dependecies from previous step called builder, where intel-cmt-cat is installed.
@@ -73,8 +71,6 @@ COPY --from=builder /usr/share/snmp/mibs/ /usr/share/snmp/mibs/
COPY --from=builder /usr/bin/telegraf /usr/bin/telegraf
COPY --from=builder entrypoint.sh .

# Set sudo config for specific plugins, and just in case update alternatives, because sometimes docker can mess it up.
RUN ./sudo_configuration.sh

# Override entrypoint
ENTRYPOINT ["sh", "-c", "rasdaemon -r && sleep 10s && ./entrypoint.sh telegraf"]
2 changes: 1 addition & 1 deletion images/telegraf/script/install_intel_cmt_cat.sh
@@ -1,6 +1,6 @@
#!/bin/bash

git clone -b v4.2.0 https://github.com/intel/intel-cmt-cat.git && cd intel-cmt-cat || exit
git clone -b v4.4.1 https://github.com/intel/intel-cmt-cat.git && cd intel-cmt-cat || exit

make
sudo NOLDCONFIG=y make install
Binary file added snapshots/bash-5.1.16.tar.gz
Binary file not shown.
3 changes: 0 additions & 3 deletions snapshots/bash-v5.1.16.zip

This file was deleted.

3 changes: 0 additions & 3 deletions snapshots/bash-v5.1.4.zip

This file was deleted.

Binary file added snapshots/ipmitool-1.8.18.tar.gz
Binary file not shown.
Binary file added snapshots/iptables-1.8.8.tar.bz2
Binary file not shown.
3 changes: 0 additions & 3 deletions snapshots/iptables-v1.8.7.zip

This file was deleted.

Binary file added snapshots/iputils-20211215.tar.gz
Binary file not shown.
3 changes: 0 additions & 3 deletions snapshots/iputils-v20210722.zip

This file was deleted.

Binary file added snapshots/nvme-cli-2.0.tar.gz
Binary file not shown.
3 changes: 0 additions & 3 deletions snapshots/nvme-cli-v1.14.zip

This file was deleted.

3 changes: 0 additions & 3 deletions snapshots/nvme-cli-v1.16.zip

This file was deleted.

3 changes: 0 additions & 3 deletions snapshots/rasdaemon-v0.6.7.zip

This file was deleted.

Binary file added snapshots/smartmontools-7.3.tar.gz
Binary file not shown.
3 changes: 0 additions & 3 deletions snapshots/smartmontools-v7.2.zip

This file was deleted.

Binary file added snapshots/util-linux-2.38.tar.gz
Binary file not shown.
3 changes: 0 additions & 3 deletions snapshots/util-linux-v2.37.4.zip

This file was deleted.

11 changes: 6 additions & 5 deletions telegraf-intel-docker.sh
@@ -8,12 +8,12 @@ DOCKER_CONTAINER_NAME=$3
DOCKER_DPDK_SOCKET_PATH="/var/run/dpdk/rte"
DOCKER_PMU_EVENTS_PATH="/var/cache/pmu"

readonly TELEGRAF_VERSION='1.21.3-alpine'
readonly TELEGRAF_VERSION='1.24.3-alpine'

readonly DOCKER_IMAGE_NAME=$2
readonly DOCKER_IMAGE_TAG='1.1'
readonly DOCKER_TELEGRAF_BUILD_IMAGE='telegraf:1.21.3-alpine'
readonly DOCKER_TELEGRAF_FINAL_BASE_IMAGE='alpine:3.15.0'
readonly DOCKER_IMAGE_TAG='1.2.0'
readonly DOCKER_TELEGRAF_BUILD_IMAGE="telegraf:${TELEGRAF_VERSION}"
readonly DOCKER_TELEGRAF_FINAL_BASE_IMAGE='alpine:3.16'

readonly CONTAINER_MEMORY_LIMIT=200m
readonly CONTAINER_CPU_SHARES=512
@@ -32,8 +32,9 @@ fi
# Build docker image.
function build_docker() {
echo -e "${GREEN}Building Telegraf Docker image...${NO_COLOR}"
echo -e "Using telegraf version: ${TELEGRAF_VERSION}. Final base image: ${DOCKER_TELEGRAF_FINAL_BASE_IMAGE}"
echo "If this is your first time building image this might take a minute..."
docker build --build-arg TELEGRAF_TAG=$TELEGRAF_VERSION -f images/telegraf/Dockerfile . -t "$DOCKER_IMAGE_NAME":$DOCKER_IMAGE_TAG
docker build --build-arg TELEGRAF_TAG=$TELEGRAF_VERSION --build-arg FINAL_BASE_IMAGE=$DOCKER_TELEGRAF_FINAL_BASE_IMAGE -f images/telegraf/Dockerfile . -t "$DOCKER_IMAGE_NAME":$DOCKER_IMAGE_TAG
}

# Build and run docker container.
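
The hunks above derive the builder image from `TELEGRAF_VERSION` and pass the final base image into the Dockerfile as a second build argument. A minimal sketch of that wiring follows; it prints the command rather than running it. `print_build_cmd` is a hypothetical helper, and the image name used in the usage comment is made up.

```sh
#!/bin/sh
# Values copied from the hunk above.
TELEGRAF_VERSION='1.24.3-alpine'
DOCKER_IMAGE_TAG='1.2.0'
DOCKER_TELEGRAF_FINAL_BASE_IMAGE='alpine:3.16'

# Hypothetical helper: print the docker build command for image name $1.
print_build_cmd() {
    image_name=$1
    echo "docker build" \
        "--build-arg TELEGRAF_TAG=${TELEGRAF_VERSION}" \
        "--build-arg FINAL_BASE_IMAGE=${DOCKER_TELEGRAF_FINAL_BASE_IMAGE}" \
        "-f images/telegraf/Dockerfile ." \
        "-t ${image_name}:${DOCKER_IMAGE_TAG}"
}

# Usage: print_build_cmd my-telegraf
```

Both arguments are declared with `ARG` before the first `FROM` in the Dockerfile hunk, which is what allows them to be used in the `FROM` lines.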
39 changes: 32 additions & 7 deletions telegraf/telegraf.conf
@@ -302,15 +302,31 @@
# normalize_keys = ["snakecase", "trim", "lower", "underscore"]


# # Intel PowerStat plugin enables monitoring of platform metrics (power, TDP) and Core metrics like temperature, power and utilization.
# # Intel PowerStat plugin enables monitoring of platform metrics (power, TDP)
# # and per-CPU metrics like temperature, power and utilization.
# [[inputs.intel_powerstat]]
# ## All global metrics are always collected by Intel PowerStat plugin.
# ## User can choose which per-CPU metrics are monitored by the plugin in cpu_metrics array.
# ## Empty array means no per-CPU specific metrics will be collected by the plugin - in this case only platform level
# ## telemetry will be exposed by Intel PowerStat plugin.
# ## The user can choose which package metrics are monitored by the plugin with
# ## the package_metrics setting:
# ## - The default, will collect "current_power_consumption",
# ## "current_dram_power_consumption" and "thermal_design_power"
# ## - Leaving this setting empty means no package metrics will be collected
# ## - Finally, a user can specify individual metrics to capture from the
# ## supported options list
# ## Supported options:
# ## "cpu_frequency", "cpu_busy_frequency", "cpu_temperature", "cpu_c1_state_residency", "cpu_c6_state_residency", "cpu_busy_cycles"
# cpu_metrics = ["cpu_frequency", "cpu_busy_frequency", "cpu_temperature", "cpu_c1_state_residency", "cpu_c6_state_residency", "cpu_busy_cycles"]
# ## "current_power_consumption", "current_dram_power_consumption",
# ## "thermal_design_power", "max_turbo_frequency", "uncore_frequency"
# package_metrics = ["current_power_consumption", "current_dram_power_consumption", "thermal_design_power", "max_turbo_frequency", "uncore_frequency"]
#
# ## The user can choose which per-CPU metrics are monitored by the plugin in
# ## cpu_metrics array.
# ## Empty or missing array means no per-CPU specific metrics will be collected
# ## by the plugin.
# ## Supported options:
# ## "cpu_frequency", "cpu_c0_state_residency", "cpu_c1_state_residency",
# ## "cpu_c6_state_residency", "cpu_busy_cycles", "cpu_temperature",
# ## "cpu_busy_frequency"
# ## ATTENTION: cpu_busy_cycles is DEPRECATED - use cpu_c0_state_residency
# cpu_metrics = ["cpu_frequency", "cpu_c0_state_residency", "cpu_c1_state_residency", "cpu_c6_state_residency", "cpu_temperature", "cpu_busy_frequency"]


# # Read metrics from the bare metal servers via IPMI
@@ -377,6 +393,15 @@
## Read the plugin documentation for more information.
chains = [ "INPUT" ]

# Gathers huge pages measurements.
[[inputs.hugepages]]
## Supported huge page types:
## - "root" - based on root huge page control directory:
## /sys/kernel/mm/hugepages
## - "per_node" - based on per NUMA node directories:
## /sys/devices/system/node/node[0-9]*/hugepages
## - "meminfo" - based on /proc/meminfo file
# types = ["root", "per_node"]

# Get kernel statistics from /proc/vmstat
[[inputs.kernel_vmstat]]
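
The `root` huge page type in the hugepages hunk above is based on `/sys/kernel/mm/hugepages`. As a rough illustration of what that directory holds, the sketch below prints the total and free page counts per huge page size. `list_hugepages` is a hypothetical helper, not something the plugin or this repository provides, and the optional directory argument exists only so the function can be pointed at a test tree.

```sh
#!/bin/sh
# Hypothetical helper: print "<size> total=<n> free=<n>" for every
# hugepages-<size> directory under the given root.
list_hugepages() {
    root=${1:-/sys/kernel/mm/hugepages}
    for dir in "$root"/hugepages-*; do
        [ -d "$dir" ] || continue       # glob did not match anything
        size=${dir##*/hugepages-}       # e.g. "2048kB"
        total=$(cat "$dir/nr_hugepages")
        free=$(cat "$dir/free_hugepages")
        echo "$size total=$total free=$free"
    done
}

# Usage: list_hugepages            (reads the host's sysfs tree)
```

This only approximates the kind of data the `root` type draws on; the plugin's own metric names and fields are described in its documentation.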