Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set max size for maxBufferSize #196

Open
bwidermeli opened this issue Dec 17, 2020 · 16 comments
Open

Set max size for maxBufferSize #196

bwidermeli opened this issue Dec 17, 2020 · 16 comments

Comments

@bwidermeli
Copy link

If you have the metric example with value 1
It gets pushed this way:
e = 1
ex = 1
exa = 1
exam = 1
examp = 1
exampl = 1
example = 1

I am not sure if this happens due to a problem on the lib or you might know why this is happening, I saw that it might be when the message gets converted into a Buffer, but I don't think the problem is right there, I have been trying to replicate the issue locally with no luck, so I cannot debug it and do not know where it is actually happening.
Right now we are having thousands of USD on unused metrics on Datadog due to this. For example:
image

You can see the issue on the inconsistency tag key or even in the scope or other keys.

@bdeitte
Copy link
Collaborator

bdeitte commented Dec 17, 2020

Hi @bwidermeli I have never seen an issue opened on this (and definitely haven't seen this in the hot-shots usage I've had). Really interesting. Could you share some of your hot-shots settings? Most interested in the 'protocol' and if you have tried changing that.

@bwidermeli
Copy link
Author

Sure!
It is indeed an interesting issue. Here are the following implementation that we have in a wrapper of hot-shots.

// Just an example of the defaultConfig object
const defaultConfig = {
  host: 'datadog',
  prefix: `test.app.`,
  maxBufferSize: 10000,
  mock: false,
  globalTags: {
    platform: 'prTest',
    application: 'test-app',
    scope: 'pre-prod',
  },
};

// The following sanitizes the project config to ignore what we won't use:
const projectConfig = {};
const {
  globalTags: projectTags
} = projectConfig;

// Next we'll merge the project config with the default one
const config = {
  ...defaultConfig,
  ...projectValidConfig,
  globalTags: {
    ...projectTags,
    ...defaultConfig.globalTags, // Has precedence over the project tags so they can't override them
  },
};

/**
 * Create a new StatsD client instance
 */
const statsDClient = new StatsD(config);

So, we actually have no protocol settings yet as we were using the [email protected] version and we are just migrating now to the 8.6.1. Any ideas?

@bwidermeli
Copy link
Author

Even when we changed to the updated version, we had no need of setting the protocol though

@bwidermeli
Copy link
Author

Just tried deploying a version with protocol set to 'uds' and the problem is the same:
image

@bdeitte
Copy link
Collaborator

bdeitte commented Dec 18, 2020

Hmm. Not sure if anybody else has seen this and I'm still stumped. Anything unusual about your DogStatsD usage, have you asked Datadog about that? Also, anything unusual in how you call hot-shots- is it some kind of indirect or functional programming call where it could be accidentally called a bunch of times? As you can see from my questions, I'm wondering about things outside of hot-shots. Outside of adding debug tracing to see the calls going out, not sure what else can be done in this library for this.

@bwidermeli
Copy link
Author

bwidermeli commented Dec 18, 2020 via email

@bwidermeli
Copy link
Author

bwidermeli commented Dec 18, 2020

Hey Brian, I just checked all this again and I can confirm that the issue is actually here on the hot-shots library, I made a whole complete new implementation on a package, replacing hot-shots and adding an implementation of node-dogstatsd but updating it locally to have all the implementation from hot-shots and it just stops failing, you can see an example on the following screenshot:
image

Basically with hot-shots that just keeps happening as you saw in the image showing the serviceaction: 1getrecentactivities
Even though I still have no clue on where the issue could be, but with this I wanted to discard if it was an issue on the implementation from our side or from Datadog, and indeed it was not a problem on the implementation, any clue?

My only concerns is if it could be an issue on the buffer with the queue logic or maybe on the callback that receives the remaining Bytes, and maybe it is something there?

Thanks!

@bwidermeli
Copy link
Author

Update: Just found the issue, looks like it's a problem with the bufferHolder, when the maxBufferSize is set to 10000 for example with no bufferInterval, we have this problem.
I still don't know what the solution would be, but at least we know now that the issue comes from the buffer, when it gets enqueued.

@bdeitte
Copy link
Collaborator

bdeitte commented Dec 21, 2020

Well that's interesting! Sorry a bit intermittent here for awhile so may be slower to respond. Did changing settings work to fix it? I'd still want to keep this issue open to be clear, until we have a test case and fix in there for the combination, but wanted to make sure that does make it fine.

@bwidermeli
Copy link
Author

bwidermeli commented Dec 21, 2020 via email

@bwidermeli
Copy link
Author

Hello Brian!
So, we just found the issue, looks like the problem is that the Datadog Agent has a maximum size buffer of 8k bytes to be received, and we were sending about 10k, there is no place where they actually specify this, so we discovered this on the code and there is actually an issue where they say "8k bytes is more than enough!".
Maybe this could be added in the hot-shots documentation or a warning or something so nobody actually sets a value more than 8192 in the maxBufferSize.
The max size on the UDP protocol is 65k but they setted that value to 8k.

@bdeitte bdeitte changed the title Tags get pushed multiple times when multiple requests on the same application asynchronously Set max size for maxBufferSize Jan 2, 2021
@bdeitte
Copy link
Collaborator

bdeitte commented Jan 2, 2021

@bwidermeli Well that's good to know, glad you figured it out. I switched the title of this bug now and will keep it open until the following is done: when using udp the maxBufferSize switches back to size 8192 if set higher than this value with a console log as to why, and the README to updated to indicate the max value allowed

@bwidermeli
Copy link
Author

Sounds great Brian! Thanks a lot for taking on this!

@xzyfer
Copy link

xzyfer commented Jan 13, 2022

From the official agent docs the agent buffer size is configurable.

If you are using a community-supported DogStatsD client that supports buffering, make sure to configure a max datagram size that does not exceed the Agent-side per-datagram buffer size (8KB by default, configurable on the Agent with dogstatsd_buffer_size) and the network/OS max datagram size.

@bwidermeli
Copy link
Author

@xzyfer Yes, they added that after we (Mercado Libre) had a really long talk with the whole team because of this. And we also added a fixed 8KB to the buffer_size on a wrapper we use on the whole company, so this would never happen again in our projects at least.

@ehaynes99
Copy link

This isn't safe as currently implemented because it's comparing the length of strings, which is NOT the same as comparing bytes. If the characters used happen to fall within the ascii range, these will be the same, but each utf-8 character could be up to 4 bytes, making the max safe value 2,048.

This should hold the buffer as a Buffer rather than waiting until it's about to send. I need to do some testing with it, but something like this:
https://github.com/ehaynes99/hot-shots/commit/89f899020c2254cd45076202a094c4b87e833377

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants