
MMDB writer consuming a lot of memory #90

Open
mandar-01 opened this issue Jul 23, 2024 · 3 comments

Comments

@mandar-01

mandar-01 commented Jul 23, 2024

Hi,

I have been using the mmdbwriter package in Go to insert some records into an MMDB file. I observed that the script that inserts the records consumes a lot of memory. I did some memory profiling with pprof and found that a couple of mmdbwriter functions account for most of the usage. I have attached screenshots below for reference.
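For reference, here is a minimal sketch of how such a heap profile can be captured with runtime/pprof (illustrative only; the file name and function are assumptions, not my actual script):

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

// Dump a heap profile just before the writer writes the MMDB file.
func dumpHeapProfile() {
	f, err := os.Create("heap.prof") // illustrative file name
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	runtime.GC() // fold recent allocations into the profile statistics
	if err := pprof.WriteHeapProfile(f); err != nil {
		log.Fatal(err)
	}
}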

[Three screenshots of pprof memory profiles, 2024-07-22]

These functions consume around 600MB each, whereas the final MMDB file was 146MB. Overall, the program consumed around 3.8GB to produce that 146MB file. I think these functions, especially Map.Copy(), keep the records in memory, and they are not garbage collected since references to them are still in use. This profiling was done just before the writer wrote the MMDB file to disk.

Here's how I have defined the MMDB writer:

writer, err := mmdbwriter.New(
	mmdbwriter.Options{
		DatabaseType:            "V1",
		IncludeReservedNetworks: true,
		RecordSize:              32,
	},
)

I am using the DeepMergeWith inserter to insert the MMDB records.
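Roughly, each insertion looks like this (the network and record values below are illustrative placeholders, not my actual data):

import (
	"log"
	"net"

	"github.com/maxmind/mmdbwriter/inserter"
	"github.com/maxmind/mmdbwriter/mmdbtype"
)

_, network, err := net.ParseCIDR("203.0.113.0/24") // placeholder network
if err != nil {
	log.Fatal(err)
}
record := mmdbtype.Map{
	"tag": mmdbtype.String("example"), // placeholder record
}
// DeepMergeWith recursively merges record into any existing value
// stored for this network.
if err := writer.InsertFunc(network, inserter.DeepMergeWith(record)); err != nil {
	log.Fatal(err)
}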

@oschwald
Member

oschwald commented Jul 23, 2024

It is expected that the writer will use a fair bit of memory. You don't provide any information on how you are using the writer, but in terms of what you have provided:

  • Map.Copy - this suggests you are using one of the merging inserter functions. You can likely reduce your memory usage either by using inserter.ReplaceWith or by writing your own inserter function (see the sketch after this list). Implementing your own function allows for much more efficient merging, as you know what the records look like and how they might change. We only use the pre-defined functions internally for the simplest cases. The default functions probably do have room for improvement, but that has not been a priority as we rarely use them.
  • insert - excluding the inserter function, the remainder of this is the in-memory representation of the tree. This will be much larger than the on-disk representation: on disk, a record takes 24-32 bits, whereas the in-memory representation is over an order of magnitude larger. Although this could be improved somewhat, it would likely come at the cost of slowing down reading from and writing to the tree and making the code more complex.
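As a rough sketch of the custom-inserter approach (assuming your records are maps and that replacing top-level keys is acceptable; this is close to what inserter.TopLevelMergeWith already does, but a version that knows your record shape could share even more):

import (
	"fmt"

	"github.com/maxmind/mmdbwriter/inserter"
	"github.com/maxmind/mmdbwriter/mmdbtype"
)

// mergeTopLevel returns an inserter.Func that copies only the top level
// of the existing map and shares unchanged nested values, instead of
// deep-copying the whole record the way DeepMergeWith does.
func mergeTopLevel(newValue mmdbtype.Map) inserter.Func {
	return func(existing mmdbtype.DataType) (mmdbtype.DataType, error) {
		if existing == nil {
			return newValue, nil
		}
		existingMap, ok := existing.(mmdbtype.Map)
		if !ok {
			return nil, fmt.Errorf("unexpected existing type: %T", existing)
		}
		merged := make(mmdbtype.Map, len(existingMap)+len(newValue))
		for k, v := range existingMap {
			merged[k] = v // nested values are shared, not copied
		}
		for k, v := range newValue {
			merged[k] = v
		}
		return merged, nil
	}
}

You would then pass mergeTopLevel(record) to writer.InsertFunc in place of inserter.DeepMergeWith(record).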

@mandar-01
Author

Thanks. Yes, you are right, I am using the DeepMergeWith inserter. I have updated the comment above and added details about how I have defined the writer.

@oschwald
Member

Looking at the code, I think it would be possible to get rid of the Copy in DeepMergeWith and instead allocate a new map only when needed; see the sketch below. It is hard to know whether this would significantly reduce your memory usage, as that would largely depend on the structure of your data and how it is modified on insert.
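Very roughly, the idea is something like the following (an illustrative sketch, not the actual code in this package): copy the existing map only once a key's merged value actually changes, and share unchanged subtrees.

import "github.com/maxmind/mmdbwriter/mmdbtype"

// deepMergeLazy merges update into existing, copying existing only when
// something actually changes. The bool reports whether the returned map
// differs from existing; unchanged subtrees are shared, not copied.
func deepMergeLazy(existing, update mmdbtype.Map) (mmdbtype.Map, bool) {
	var merged mmdbtype.Map // allocated lazily, on the first change
	for k, uv := range update {
		// Non-map values always count as changes; a version that knows
		// the record shape could compare scalars and skip identical ones.
		newVal := uv
		changed := true
		if em, ok := existing[k].(mmdbtype.Map); ok {
			if um, ok := uv.(mmdbtype.Map); ok {
				newVal, changed = deepMergeLazy(em, um)
			}
		}
		if !changed {
			continue
		}
		if merged == nil { // first change: copy the top level once
			merged = make(mmdbtype.Map, len(existing)+len(update))
			for ek, ev := range existing {
				merged[ek] = ev
			}
		}
		merged[k] = newVal
	}
	if merged == nil {
		return existing, false
	}
	return merged, true
}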

That said, we don't have a single internal use of DeepMergeWith, so I don't know whether this is a change we are likely to work on.
