feat: journald logs on vector #648

Merged 4 commits on Sep 30, 2024
Conversation

outerlook (Contributor) commented Sep 27, 2024

Description

  • Update the prod and dev compose files and include the config needed for journald log collection

Related Problem

How Has This Been Tested?

  • Run the compose file; journald logs should appear if their priority is 0-4. Use priority 5 if you want a faster test.

Notes

  • The current configuration probably won't capture everything we will eventually need from the servers, but I think it's faster to deploy it now and modify it as we go

Summary by CodeRabbit

  • New Features

    • Enhanced documentation for better clarity and detail on deployment components.
    • Introduced new logging capabilities, including a console sink for logs and a dedicated logs collection from journald.
    • Expanded metrics collection to include broader input patterns for both development and production environments.
  • Bug Fixes

    • Updated service configurations to ensure proper access to system logs and improved functionality.
  • Documentation

    • Reformatted and detailed the README for improved readability and understanding.

This update adds support for capturing systemd logs via journald in both production and development configurations. Key changes include defining new log source and destination in Vector configurations, switching to Debian-based Vector images for compatibility, and mounting necessary directories for journald access. Additionally, console output has been configured for the development environment to aid in debugging.
Altered `vector-sources.yml` to capture logs with priority 1 to 4 for more critical log levels. Expanded `vector-dev-destination.yml` to use wildcard inputs, allowing for flexible metric and log aggregation.
Modified datadog-destination and datadog-logs inputs in vector-prod-destination.yml. This change replaces specific input names with wildcard patterns to ensure all relevant metrics and logs are captured.

coderabbitai bot commented Sep 27, 2024

Walkthrough

The pull request introduces several updates to the deployment configurations for the Vector service, enhancing its logging capabilities. Key changes include the transition to a Debian-based image for better compatibility with journald, the addition of new volume mounts for accessing system logs, and the introduction of new sinks for collecting logs and metrics. Documentation has also been improved for clarity regarding the deployment components and their functionalities.

Changes

| File | Change Summary |
| --- | --- |
| deployments/observer/README.md | Documentation updated for clarity, including expanded descriptions of components, reformatting to bullet points, and a new section on logs collection. |
| deployments/observer/dev-observer-compose.yml | Configuration updates for vector, prometheus, and grafana services, including specific image version, user permissions, and volume mounts. |
| deployments/observer/observer-compose.yml | Updated vector service configuration to use a Debian image, added user permissions, and new volume mounts for system logs. |
| deployments/observer/vector-dev-destination.yml | Modified prometheus sink to accept multiple metrics and added a new console sink for log output. |
| deployments/observer/vector-prod-destination.yml | Introduced a new datadog-logs sink and updated the datadog-destination sink to accept multiple metrics. |
| deployments/observer/vector-sources.yml | Renamed hostmetrics to out-metrics-hostmetrics and added a new source for journald logs. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Vector
    participant Datadog
    participant Journald

    User->>Vector: Send metrics and logs
    Vector->>Journald: Collect logs
    Vector->>Datadog: Forward metrics and logs
    Datadog-->>User: Provide monitoring data

Assessment against linked issues

Objective Addressed Explanation
Set up Vector to collect host logs from multiple hosts (#572)

Possibly related PRs

  • feat: dev and prod observer configurations #631: The changes in this PR involve configurations for the vector service, which directly relates to the updates made in the dev-observer-compose.yml and observer-compose.yml files in the main PR, particularly regarding the service's image and logging capabilities.

Suggested reviewers

  • MicBun

Poem

In the meadow where the logs do flow,
A rabbit hops with joy, you know!
Vector's set to gather all,
Metrics and logs, it heeds the call.
With Debian's strength, it leaps so high,
To Datadog, our data will fly! 🐇✨


@outerlook outerlook self-assigned this Sep 27, 2024
@outerlook outerlook marked this pull request as ready for review September 27, 2024 12:36
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (11)
deployments/observer/vector-dev-destination.yml (3)

5-9: LGTM! Consider adding a comment for clarity.

The change from a specific input to a wildcard pattern (out-metrics-*) is a good improvement. It allows for more flexibility in capturing metrics from multiple sources, which aligns well with the PR objective of enhancing logging capabilities and collecting logs from multiple hosts.

Consider adding a comment explaining the purpose of the wildcard pattern. For example:

inputs:
  # Capture metrics from all 'out-metrics-*' sources
  - out-metrics-*

This would improve the configuration's readability and maintainability.


11-17: Great addition! Consider adding flush_period_secs for consistency.

The addition of a console sink for raw log output is an excellent improvement. It will be very useful for debugging and monitoring purposes. The use of logfmt encoding and the wildcard pattern for inputs aligns well with the PR objectives.

For consistency with the prometheus sink, consider adding a flush_period_secs setting to the console sink. This would ensure that logs are flushed at regular intervals. For example:

console:
  type: console
  encoding:
    codec: logfmt
  inputs:
    - out-logs-*
  flush_period_secs: 60  # Add this line

This would make the behavior of both sinks more consistent and predictable.


1-17: Overall, excellent improvements to the Vector configuration!

The changes made in this file significantly enhance the logging capabilities of the system, aligning well with the PR objectives. The use of wildcard patterns for inputs in both the prometheus and console sinks allows for more flexible and comprehensive log collection from multiple sources.

To further improve the configuration, consider adding a comment at the top of the file explaining the purpose of this configuration and how it relates to the overall logging strategy. For example:

# Vector configuration for development environment
# This configuration sets up prometheus metrics export and console log output
# for comprehensive logging and monitoring across multiple hosts.

This would provide valuable context for anyone working with this configuration in the future.

deployments/observer/vector-prod-destination.yml (1)

Line range hint 1-17: Consider adding explanatory comments for each sink.

While the configuration looks good, it would be beneficial to add comments explaining the purpose and scope of each sink. This would improve maintainability and make it easier for other developers to understand the configuration.

Consider adding comments like these:

sinks:
  # Collect and forward various system metrics to Datadog
  datadog-destination:
    # ... (existing configuration)

  # Collect and forward system logs to Datadog
  datadog-logs:
    # ... (existing configuration)
deployments/observer/vector-sources.yml (1)

Line range hint 4-14: Approved: Good renaming for clarity.

The renaming of the source from hostmetrics to out-metrics-hostmetrics improves clarity by indicating its purpose (output) and type (metrics). The existing configuration is maintained, which is good for consistency.

Consider updating the comment on line 14 to be more specific:

-      # - cgroups # we don't need it discriminated here
+      # - cgroups # Excluded as we don't need granular cgroup metrics here

This provides a clearer explanation of why the cgroups collector is commented out.

deployments/observer/observer-compose.yml (3)

6-6: Approved: Image update for journald support

The change to a Debian-based image is appropriate for enabling journald input support. This aligns well with the PR objective of collecting journald logs.

Consider updating the comment to be more specific:

-    image: timberio/vector:0.41.1-debian # debian is needed for journald input
+    image: timberio/vector:0.41.1-debian # Debian-based image required for journald input support

12-15: Approved: New volume mounts for journald access

The addition of read-only mounts for /run/log/journal, /var/log/journal, and /etc/machine-id is appropriate for accessing the host's systemd journal. This directly supports the PR objective of collecting journald logs.

Consider updating the comment to be more specific and include information about each mount:

-      # we want to mount the journal and machine-id to get logs from systemd from host
+      # Mount host's journal directories and machine-id for systemd log access
       - /run/log/journal:/run/log/journal:ro
       - /var/log/journal:/var/log/journal:ro
-      - /etc/machine-id:/etc/machine-id:ro
+      - /etc/machine-id:/etc/machine-id:ro  # Required for unique host identification

Line range hint 1-19: Summary: Configuration updates for journald log collection

The changes made to the Vector service configuration in this file successfully address the PR objective of enabling journald log collection. Key improvements include:

  1. Updating to a Debian-based image for journald support.
  2. Running the service as root to access system logs.
  3. Adding necessary volume mounts for accessing the host's systemd journal.

These changes are well-aligned with the goal of enhancing log collection capabilities. However, there are a few points to consider:

  1. The security implications of running the container as root should be carefully evaluated and mitigated.
  2. Ensure that the Datadog configuration is correct and securely managed.

Recommendation: After addressing the suggestions in the previous comments, consider adding a comment at the top of the file explaining the recent changes and their purpose. This will help future maintainers understand the rationale behind these configuration updates.

deployments/observer/README.md (2)

28-28: LGTM: Improved clarity on Vector's role in production.

The updated description accurately reflects Vector's expanded capabilities in the production environment, now including both metrics and logs. The new Components subsection provides additional clarity.

Consider adding a brief mention of Datadog in the Components subsection for completeness, as it's the destination for the metrics and logs. For example:

 ### Components

 - **Vector**: Collects host metrics and logs, sends them to Datadog
+- **Datadog**: Receives and processes metrics and logs from Vector

Also applies to: 36-38


40-42: LGTM: New section on Logs Collection adds valuable information.

The new Logs Collection section directly addresses the PR objectives and provides clear information about Vector's configuration for collecting and forwarding logs.

Consider adding more detailed information about log priority levels as mentioned in the PR objectives. For example:

 ### Logs Collection

 - **Journald**: Vector is configured to collect logs from `journald` and forward them to Datadog logs in production, or to console in development.
+- **Log Priority**: By default, logs with priority levels between 0 and 4 are collected. For quicker testing, priority level 5 can be used.

This addition would provide users with important information about the log collection configuration and testing options.
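The priority range mentioned above can be illustrated with a small shell sketch. This is not part of the PR: `priority_is_collected` and `MAX_PRIORITY` are hypothetical names, and the default of 4 only mirrors the "0 to 4" range quoted in this comment. It assumes the standard syslog numbering, where 0 is the most severe level and 7 is debug.

```shell
#!/bin/bash
# Hypothetical helper (not from the PR): decide whether a journald entry with
# the given syslog priority falls inside the collected range.
# MAX_PRIORITY=4 mirrors the "0 to 4" range above; set it to 5 for a quick test.
MAX_PRIORITY="${MAX_PRIORITY:-4}"

priority_is_collected() {
  local priority="$1"
  # Syslog priorities run from 0 (emerg) to 7 (debug); lower is more severe.
  [ "$priority" -ge 0 ] && [ "$priority" -le "$MAX_PRIORITY" ]
}

# Show the decision for a few priorities around the threshold.
for p in 3 4 5 6; do
  if priority_is_collected "$p"; then
    echo "priority $p: collected"
  else
    echo "priority $p: skipped"
  fi
done
```

On a systemd host, a command such as `logger -p user.err "vector journald test"` should write a priority-3 entry to the journal, which is one way to produce a message inside this range while testing.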

deployments/observer/dev-observer-compose.yml (1)

36-36: Approved: Explicit volume configuration and file formatting

The change to the volumes configuration is a good practice:

  1. Explicitly setting grafana-storage: null (line 36) clearly indicates the use of the default driver without options.

However, there's a minor formatting issue:

  1. The file is missing a newline character at the end, as indicated by the static analysis tool.

Please add a newline character at the end of the file to comply with YAML best practices and resolve the linter warning.

🧰 Tools
🪛 yamllint

[error] 36-36: no new line character at the end of file

(new-line-at-end-of-file)
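One possible way to check for and repair the missing trailing newline from the shell is sketched below. The function names `ends_with_newline` and `fix_missing_newline` are illustrative and do not come from the PR or from yamllint.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the file's last byte is a newline.
ends_with_newline() {
  local file="$1"
  # Command substitution strips a trailing newline, so the result is empty
  # when the last byte is "\n" (an empty file also passes this check).
  [ -z "$(tail -c 1 "$file")" ]
}

# Append a newline only when it is missing, so repeated runs change nothing.
fix_missing_newline() {
  local file="$1"
  ends_with_newline "$file" || printf '\n' >> "$file"
}
```

For the file flagged here, the call would be `fix_missing_newline deployments/observer/dev-observer-compose.yml`, run from the repository root.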

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between f122136 and c3a798c.

📒 Files selected for processing (6)
  • deployments/observer/README.md (1 hunks)
  • deployments/observer/dev-observer-compose.yml (1 hunks)
  • deployments/observer/observer-compose.yml (1 hunks)
  • deployments/observer/vector-dev-destination.yml (1 hunks)
  • deployments/observer/vector-prod-destination.yml (1 hunks)
  • deployments/observer/vector-sources.yml (2 hunks)
🧰 Additional context used
📓 Learnings (2)
deployments/observer/vector-dev-destination.yml (1)
Learnt from: outerlook
PR: truflation/tsn#631
File: deployments/observer/vector-prod-destination.yml:1-9
Timestamp: 2024-09-24T17:19:04.301Z
Learning: In the Vector configuration for Datadog metrics sink, the options `tags` and `metric_type` do not exist.
deployments/observer/vector-prod-destination.yml (1)
Learnt from: outerlook
PR: truflation/tsn#631
File: deployments/observer/vector-prod-destination.yml:1-9
Timestamp: 2024-09-24T17:19:04.301Z
Learning: In the Vector configuration for Datadog metrics sink, the options `tags` and `metric_type` do not exist.
🪛 LanguageTool
deployments/observer/README.md

[uncategorized] ~32-~32: Loose punctuation mark.
Context: ...vironment Variables - DATADOG_API_KEY: Datadog API key - DATADOG_NAMESPACE: ...

(UNLIKELY_OPENING_PUNCTUATION)

🔇 Additional comments (11)
deployments/observer/vector-prod-destination.yml (2)

11-17: LGTM! Verify the wildcard pattern for logs collection.

The addition of the datadog-logs sink aligns well with the PR objective of collecting journald logs. The use of the out-logs-* pattern allows for flexible log source collection, and the consistent use of environment variables maintains good security practices.

To ensure this new sink captures all intended log sources and no unintended ones, please verify the naming convention of your log sources. Run the following command to check for log source definitions:

✅ Verification successful

To ensure the shell script runs correctly and accurately captures all relevant log source definitions, please execute the following corrected script:


Wildcard pattern for logs collection is consistent with existing configurations.

The out-logs-* pattern in vector-prod-destination.yml aligns with the current logging setup, effectively capturing all intended log sources without introducing unintended matches.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for log source definitions that match the new wildcard pattern.

# Test: Search for log source definitions. Expect: Only intended log sources.
rg --type yaml 'out-logs-' -g '!deployments/observer/vector-prod-destination.yml'

Length of output: 303


Script:

#!/bin/bash
# Description: Check for log source definitions that match the new wildcard pattern.

# Test: Search for log source definitions. Expect: Only intended log sources.
rg --type yaml 'out-logs-' -g '!deployments/observer/vector-prod-destination.yml'

Length of output: 209


6-9: LGTM! Verify the wildcard pattern for metrics collection.

The change from hostmetrics to out-metrics-* allows for collecting a broader range of metrics, which aligns with the PR objective of enhancing logging capabilities. The use of environment variables for sensitive information is a good security practice.

To ensure this change doesn't unintentionally capture unwanted metrics, please verify the naming convention of your metric sources. Run the following command to check for any unexpected metric sources:

✅ Verification successful

Verified! The wildcard pattern out-metrics-* correctly captures only the intended out-metrics-hostmetrics source without introducing unintended metrics. The configuration follows best practices for security and specificity.
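The effect of the wildcard can be sketched in shell. This only approximates Vector's input matching with a shell glob; it does not call Vector. The two source IDs are the ones named in this PR, and `matching_sources` is a hypothetical helper.

```shell
#!/bin/bash
# Sketch only: approximates Vector's input wildcard with a shell glob.
# The source IDs below are the two named in this PR.
sources="out-metrics-hostmetrics out-logs-journald"

# Print each source ID that matches the given wildcard pattern.
matching_sources() {
  local pattern="$1" id
  for id in $sources; do
    # An unquoted variable in a case arm is interpreted as a glob pattern.
    case "$id" in
      $pattern) echo "$id" ;;
    esac
  done
}

matching_sources 'out-metrics-*'   # prints out-metrics-hostmetrics
matching_sources 'out-logs-*'      # prints out-logs-journald
```

Under this naming scheme, a new source is picked up by a sink as soon as its ID carries the matching prefix, which is also why an unintended match would come from a misnamed source rather than from the sink configuration.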

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for unexpected metric sources that match the new wildcard pattern.

# Test: Search for metric source definitions. Expect: Only intended metric sources.
rg --type yaml 'out-metrics-' -g '!deployments/observer/vector-prod-destination.yml'

Length of output: 303


Script:

#!/bin/bash
# Description: Check for unexpected metric sources that match the new wildcard pattern.

# Test: Search for metric source definitions. Expect: Only intended metric sources.
rg --type yaml 'out-metrics-' -g '!deployments/observer/vector-prod-destination.yml'

Length of output: 221


Script:

#!/bin/bash
# Description: Verify the configuration of 'out-metrics-hostmetrics' in vector-sources.yml.

# Check the specific configuration for out-metrics-hostmetrics
cat deployments/observer/vector-sources.yml | grep 'out-metrics-hostmetrics'

Length of output: 103


Script:

#!/bin/bash
# Description: Extract the full configuration of 'out-metrics-hostmetrics' in vector-sources.yml.

# Extract the block of 'out-metrics-hostmetrics' for detailed inspection
rg -A 5 'out-metrics-hostmetrics:' deployments/observer/vector-sources.yml

Length of output: 310

deployments/observer/vector-sources.yml (1)

Line range hint 1-21: Summary: Changes align well with PR objectives.

The modifications to this file successfully implement the required updates for journald log collection, addressing the linked issue #572. The renaming of the existing source improves clarity, while the addition of the new out-logs-journald source with appropriate configuration enables the collection of host logs as intended.

The priority filtering (levels 1-4) aligns with the PR objectives, though a minor clarification was suggested in the comments. Overall, these changes contribute to the enhanced logging capabilities outlined in the PR summary.

To ensure these changes are properly reflected in the Docker Compose configurations mentioned in the PR objectives, please run the following verification script:

This script will help confirm that the updated Vector configuration is properly integrated into both the production and development environments.

✅ Verification successful

Vector Configuration References Confirmed in Docker Compose Files

All Docker Compose files appropriately reference vector-sources.yml, specifically in:

  • deployments/observer/observer-compose.yml
  • deployments/observer/dev-observer-compose.yml

This confirms that the Vector configuration is properly integrated into both production and development environments.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the Vector configuration is correctly referenced in Docker Compose files

# Test: Check if the vector-sources.yml file is referenced in the Docker Compose files
# Expect: References to vector-sources.yml in both production and development Docker Compose files

echo "Checking Docker Compose files for vector-sources.yml reference:"
rg --type yaml 'vector-sources.yml' -g 'docker-compose*.yml'

Length of output: 346


Script:

#!/bin/bash
# Description: Broadly verify that the Vector configuration is correctly referenced in any YAML files

# Test: Search for references to vector-sources.yml in all YAML files
# Expect: At least one reference found in Docker Compose or related configuration files

echo "Searching for references to vector-sources.yml in all YAML files:"
rg --type yaml 'vector-sources.yml'

Length of output: 406

deployments/observer/observer-compose.yml (2)

Line range hint 16-19: Verify Datadog configuration

The Datadog-related environment variables are correctly maintained in the configuration. This ensures that metrics can be sent to Datadog for monitoring purposes, which is crucial for the Observer setup in production.

Please confirm that:

  1. The DATADOG_API_KEY is being securely managed and not exposed in the repository.
  2. The DATADOG_NAMESPACE is correctly set for your environment.
  3. The DATADOG_ENDPOINT is pointing to the correct Datadog intake server.

You can use the following script to check for any hardcoded Datadog configurations:

#!/bin/bash
# Description: Check for any hardcoded Datadog configurations

# Test: Search for any hardcoded Datadog API keys or endpoints
rg --type yaml -i '(datadog.*api.*key|datadog.*endpoint)'

Ensure that these variables are properly documented in your project's documentation or README file.
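A narrower variant of the check above could flag only values that are written as literals rather than as environment-variable references. The sketch below is an assumption-laden example: the key names `default_api_key`, `api_key`, and `endpoint` are guesses at what the sink configuration uses, and `find_hardcoded_values` is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical check: list key/endpoint lines whose value is a literal rather
# than an environment-variable reference such as ${DATADOG_API_KEY}.
# The key names matched here are assumptions, not taken from the PR.
find_hardcoded_values() {
  local file="$1"
  # The first grep keeps API key and endpoint assignments (with line numbers);
  # the second drops lines that reference a ${VAR}.
  grep -nEi '^[[:space:]]*(default_api_key|api_key|endpoint):' "$file" \
    | grep -v '\${[A-Za-z_][A-Za-z0-9_]*}'
}
```

The function prints the offending lines and returns 0 when at least one literal value is found; it returns non-zero when every matched line uses a `${VAR}` reference. A call might look like `find_hardcoded_values deployments/observer/vector-prod-destination.yml`.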


8-8: Approved with caution: Root user for journalctl access

Running the service as root lets Vector read the host's systemd journal, which aligns with the PR objectives. However, this introduces potential security risks.

To ensure this change is necessary and to explore potential alternatives:

Consider implementing additional security measures:

  1. Use the --cap-drop ALL option to drop all capabilities and then add only the necessary ones.
  2. Implement a custom AppArmor or SELinux profile for this container.
  3. Ensure that the container is running in its own user namespace.

Please add a comment explaining the security implications and any mitigations implemented.

deployments/observer/README.md (2)

17-19: LGTM: Improved readability and accuracy of component descriptions.

The use of bullet points enhances readability, and the expanded description of Vector accurately reflects its new capabilities for collecting both host metrics and logs. This aligns well with the PR objectives.


23-24: LGTM: Consistent formatting for improved readability.

The use of bullet points in the Ports section maintains consistency with the Components section, enhancing overall document readability.

deployments/observer/dev-observer-compose.yml (4)

20-20: Approved: Minor syntax improvements for Prometheus

The changes to the Prometheus service configuration are minor syntax improvements:

  1. Updated command syntax to use single quotes (line 20).
  2. Updated port mapping syntax to use single quotes (line 24).

These changes improve consistency without altering functionality.

Also applies to: 24-24


29-29: Approved: Minor syntax improvement for Grafana

The change to the Grafana service configuration is a minor syntax improvement:

  1. Updated port mapping syntax to use single quotes (line 29).

This change improves consistency without altering functionality.


Line range hint 1-36: Summary: Successfully implemented journald log collection for Vector

This PR successfully updates the development Docker Compose configuration to enable journald log collection, addressing issue #572. Key improvements include:

  1. Vector service now uses a Debian-based image for journald support.
  2. Added necessary volume mounts for accessing host journal logs and machine ID.
  3. Updated configurations for Prometheus and Grafana for consistency.

These changes align well with the PR objectives and should allow for collecting logs with priority levels between 0 and 4, with the option to use level 5 for testing.

To ensure the changes are complete, let's verify the Vector configuration:

#!/bin/bash
# Description: Verify Vector configuration for journald input and log filtering

# Test: Check for journald source configuration and log filtering
echo "Checking Vector configuration:"
rg --type yaml -A 10 'journald' deployments/observer/vector-sources.yml

This script will help confirm that the Vector configuration includes the journald source and appropriate log filtering settings.


6-8: Improvements for Vector configuration, but consider security implications

The changes to the Vector service configuration align well with the PR objectives:

  1. Using a Debian-based image (line 6) is appropriate for journald support.
  2. The updated command syntax (line 7) looks correct.
  3. The new volume mounts (lines 13-15) are essential for collecting host logs from systemd.

However, there's a potential security concern:

  1. Running as root (line 8) is necessary for accessing journalctl, but it's generally a security risk. Consider if it's possible to use a less privileged user with specific capabilities instead.

To ensure these changes are consistent with the Vector configuration, let's check the Vector configuration files:

Also applies to: 13-15

@MicBun MicBun merged commit 5bb8364 into main Sep 30, 2024
10 checks passed
@MicBun MicBun deleted the feat/vector-host-logs branch September 30, 2024 02:18
@coderabbitai coderabbitai bot mentioned this pull request Nov 8, 2024
Development

Successfully merging this pull request may close these issues.

Problem: Vector host logs collection not set up
2 participants