Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log memory usage of the agent and the mapper #2693

Merged

Conversation

didier-wenzek
Copy link
Contributor

@didier-wenzek didier-wenzek commented Feb 8, 2024

Proposed changes

Log memory usage of the agent and the mapper.
The default interval is of 60 seconds.

This can be changed using the --log-memory-interval flag which sets the logging interval in seconds.

$ target/debug/tedge-mapper --log-memory-interval 5 c8y
$ sudo target/debug/tedge-agent --log-memory-interval 5

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
  • Documentation Update (if none of the other choices apply)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue

#2692 (comment)

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA (in all commits with git commit -s)
  • I ran cargo fmt as mentioned in CODING_GUIDELINES
  • I used cargo clippy as mentioned in CODING_GUIDELINES
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Further comments

Copy link

codecov bot commented Feb 8, 2024

Codecov Report

Attention: 55 lines in your changes are missing coverage. Please review.

Comparison is base (449f42b) 76.9% compared to head (e783cfb) 76.9%.
Report is 1 commits behind head on main.

Additional details and impacted files
Files Coverage Δ
.../tedge_config/src/tedge_config_cli/tedge_config.rs 81.1% <75.0%> (-0.1%) ⬇️
crates/core/tedge_mapper/src/lib.rs 0.0% <0.0%> (ø)
crates/common/tedge_config/src/lib.rs 0.0% <0.0%> (ø)
crates/core/tedge_mapper/src/main.rs 6.2% <8.3%> (+6.2%) ⬆️
crates/core/tedge_agent/src/main.rs 6.2% <7.6%> (+6.2%) ⬆️
crates/core/tedge/src/main.rs 45.5% <4.0%> (-15.9%) ⬇️

... and 3 files with indirect coverage changes

Copy link
Contributor

github-actions bot commented Feb 8, 2024

Robot Results

✅ Passed ❌ Failed ⏭️ Skipped Total Pass % ⏱️ Duration
391 0 3 391 100 55m28.415999999s

crates/core/tedge_agent/src/lib.rs Outdated Show resolved Hide resolved
@@ -56,6 +56,7 @@ c8y_http_proxy = { path = "crates/extensions/c8y_http_proxy" }
c8y_log_manager = { path = "crates/extensions/c8y_log_manager" }
c8y_mapper_ext = { path = "crates/extensions/c8y_mapper_ext" }
camino = "1.1"
cap = "0.1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side commentary: Came across this sysinfo crate that covers a lot more parameters like CPU usage. Something we can consider in the future, for advanced monitoring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose is different. Sysinfo observes the operating system, while here the point is to monitor the amount of memory actually allocated by tedge on the heap.


/// Interval at which the memory usage is logged (in seconds)
#[clap(long = "log-memory-interval", default_value = "60")]
pub log_memory_interval: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a short term addition or something permanent? If permanent, any specific reason why this is a command-line argument and not tedge config setting (other than convenience and simplicity)? No objections here, but just trying to understand the rationale behind that choice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to set this right when the process starts - ideally before any thin-edge processing.

I see this as a new feature. This can possibly be extended with:

  • the number of threads (currently we use the default i.e one thread per core),
  • a cap on allocated memory (crashing if the process starts to go wild).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Supporting a config would make it easier to enable on devices under test...as adding a flag is more difficult as we would need to use systemctl edit ... to enable this feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will give a try to a config setting. This is definitely more convenient. However, not so straightforward to implement.

Copy link
Contributor Author

@didier-wenzek didier-wenzek Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Supporting a config would make it easier to enable on devices under test...as adding a flag is more difficult as we would need to use systemctl edit ... to enable this feature.

Done with 4d19e41

The new setting is run.log_memory_interval with a default of 5 minutes.


#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
async fn main() -> anyhow::Result<()> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to consider deprecating this main and enforcing the use of a tedge -agent as a symlink to tedge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a ticket for that: #2696

Comment on lines +33 to +37
let tedge_config = tedge_config::load_tedge_config(&opt.config_dir)?;
block_on_with(
tedge_config.run.log_memory_interval.duration(),
tedge_mapper::run(opt),
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very happy with that, i.e. loading the config without passing it to tedge_mapper::run().

Doing so is not straightforward as each of the actors used by the mapper and the agent have specific way to load the config, with or without a reference to the config-dir and other weirdness including duplicated paths to the same setting. It would be good to simplify all that in a follow up PR.

The default interval is of 60 seconds.

This can be changed using the --log-memory-interval flag
which sets the logging interval in seconds.

```
$ target/debug/tedge-mapper --log-memory-interval 5 c8y
$ sudo target/debug/tedge-agent --log-memory-interval 5
```

Signed-off-by: Didier Wenzek <[email protected]>
@didier-wenzek didier-wenzek force-pushed the feat/trace-memory-usage branch from 4d19e41 to e783cfb Compare February 9, 2024 17:27
Copy link
Contributor

@reubenmiller reubenmiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@didier-wenzek didier-wenzek added this pull request to the merge queue Feb 9, 2024
Merged via the queue into thin-edge:main with commit 5ce44d6 Feb 9, 2024
20 checks passed
@didier-wenzek didier-wenzek deleted the feat/trace-memory-usage branch February 15, 2024 09:33
@reubenmiller reubenmiller added this to the 1.0.1 milestone Feb 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants