Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read /proc/net files with a single read syscall. #361

Merged
merged 2 commits into from
May 4, 2017

Conversation

ablagoev
Copy link
Contributor

@ablagoev ablagoev commented May 3, 2017

The /proc/net files are not guaranteed to be consistent, they are only
consitent on the row level. This is probably one of the reasons why
consequent read calls might return duplicate entries - the kernel is
changing the file as it is being read. In certain situations this might
lead to loop like situations - the same net entry is being returned by consequent read calls as new connections are added to the kernel tcp table.

This PR is trying to reduce the duplications, by fetching the contents
of the net files with a single read syscall.

This discussion on Stackoverflow goes into more detail on how the /proc files work in terms of consistency.

In our use case, there were certain situations where the netstat telegraf input plugin spiked eating up a lot of memory. On further inspection it appeared that this was due to a lot of entries in the /proc/net/tcp file. These entries, however, did not correspond to higher tcp connections - they remained within the expected range. This is what leads me to believe, that in certain situations loop like conditions are created and a lot of duplicate rows appear in /proc/net/tcp.

Consider the following graphs:

graphs

The memory in the graph is the memory taken from the netstat telegraf plugin. You can see how spikes in the /proc/net/tcp length correspond to spikes in netstat's memory usage. Additionally, in those spikes, there are no real spikes in tcp connection counts. The length of /proc/net/tcp was gathered separately using cat /proc/net/tcp. cat itself issues many read calls to the file, so it has the same behavior as the netstat plugin.

After the patch the graphs look like this:

graphafter

There are no more spikes in memory.

Slightly more CPU is consumed, but this is compensated with the reduction of duplicates and stabilization of the memory used. For smaller use cases I don't think CPU usage will be an issue, in higher ones overall single reads yields much better performance.
In our use case we are observing a 56% improvement in the time required to gather the netstat data:

netstatcomparison

The /proc/net files are not guaranteed to be consistent, they are only
consitent on the row level. This is probably one of the reasons why
consequent read calls might return duplicate entries - the kernel is
changing the file as it is being read. In certain situations this might
lead to loop like situations - the same net entry is being returned when
reading the file as new connections are added to the kernel tcp table, i.e
there can be a lot of duplications.

This commit is trying to reduce the duplications, by fetching the contents
of the net files with a single read syscall.
@shirou
Copy link
Owner

shirou commented May 4, 2017

So cool! Thank you. Could you add some comments about why not to use common.ReadLines
with this PR URL?

@shirou
Copy link
Owner

shirou commented May 4, 2017

Thank you for your contribution!

@shirou shirou merged commit f7c38fa into shirou:master May 4, 2017
apoorv007 added a commit to archsaber/gopsutil that referenced this pull request Jun 24, 2017
@zzyongx
Copy link

zzyongx commented Apr 10, 2019

ioutil.ReadFile is still slow on centos-6, when /proc/net/tcp is bigger than 30M. But it's not go's problem, use cat is slow too.

@Lomanic
Copy link
Collaborator

Lomanic commented May 28, 2019

@zzyongx see #695 for how gopsutil could improve performance using netlink API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants