Accelerated DHT Client causes OOM kill upon start of IPFS, ResourceMgr.MaxMemory ignored #9990
Comments
Could you please post an |
Ah. That's a bit misleading from the name of the setting and AFAIK not documented anywhere.
I can make another test run before it goes harakiri, perhaps in about 3 days (once I'm done ingesting the 22 TB ipfs-search.com snapshot archive :p).
*Kubo 😉 Need to rewrite everything into Zig before attempting that. You can use |
I've turned it on again and set Perhaps still notable though that without this, the accelerated DHT client is particularly memory hungry.
Config
{
"API": {
"HTTPHeaders": {
"Access-Control-Allow-Methods": [
"PUT",
"POST"
],
"Access-Control-Allow-Origin": [
"http://<private>:10125",
"http://localhost:3000",
"http://127.0.0.1:5001",
"https://webui.ipfs.io"
]
}
},
"Addresses": {
"API": "/ip4/0.0.0.0/tcp/5001",
"Announce": [],
"AppendAnnounce": [
"/dns4/<private>/tcp/4001",
"/dns4/<private>/udp/4001/quic",
"/dns4/<private>/udp/4001/quic-v1",
"/dns4/<private>/udp/4001/quic-v1/webtransport",
"/dns4/<private>/tcp/4001",
"/dns4/<private>/udp/4001/quic",
"/dns4/<private>/udp/4001/quic-v1",
"/dns4/<private>/udp/4001/quic-v1/webtransport"
],
"Gateway": "/ip4/0.0.0.0/tcp/8080",
"NoAnnounce": [
"/ip4/10.0.0.0/ipcidr/8",
"/ip4/100.64.0.0/ipcidr/10",
"/ip4/169.254.0.0/ipcidr/16",
"/ip4/172.16.0.0/ipcidr/12",
"/ip4/192.0.0.0/ipcidr/24",
"/ip4/192.0.2.0/ipcidr/24",
"/ip4/192.168.0.0/ipcidr/16",
"/ip4/198.18.0.0/ipcidr/15",
"/ip4/198.51.100.0/ipcidr/24",
"/ip4/203.0.113.0/ipcidr/24",
"/ip4/240.0.0.0/ipcidr/4",
"/ip6/100::/ipcidr/64",
"/ip6/2001:2::/ipcidr/48",
"/ip6/2001:db8::/ipcidr/32",
"/ip6/fc00::/ipcidr/7",
"/ip6/fe80::/ipcidr/10"
],
"Swarm": [
"/ip4/0.0.0.0/tcp/4001",
"/ip6/::/tcp/4001",
"/ip4/0.0.0.0/udp/4001/quic",
"/ip4/0.0.0.0/udp/4001/quic-v1",
"/ip4/0.0.0.0/udp/4001/quic-v1/webtransport",
"/ip6/::/udp/4001/quic",
"/ip6/::/udp/4001/quic-v1",
"/ip6/::/udp/4001/quic-v1/webtransport"
]
},
"AutoNAT": {},
"Bootstrap": [
"/dnsaddr/bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN",
"/dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa",
"/dnsaddr/bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb",
"/dnsaddr/bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt",
"/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
"/ip4/104.131.131.82/udp/4001/quic/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ"
],
"DNS": {
"Resolvers": {}
},
"Datastore": {
"BloomFilterSize": 1048576,
"GCPeriod": "1h",
"HashOnRead": false,
"Spec": {
"mounts": [
{
"child": {
"path": "blocks",
"shardFunc": "/repo/flatfs/shard/v1/next-to-last/3",
"sync": false,
"type": "flatfs"
},
"mountpoint": "/blocks",
"prefix": "flatfs.datastore",
"type": "measure"
},
{
"child": {
"compression": "none",
"path": "datastore",
"type": "levelds"
},
"mountpoint": "/",
"prefix": "leveldb.datastore",
"type": "measure"
}
],
"type": "mount"
},
"StorageGCWatermark": 90,
"StorageMax": "100GB"
},
"Discovery": {
"MDNS": {
"Enabled": false
}
},
"Experimental": {
"AcceleratedDHTClient": true,
"FilestoreEnabled": true,
"GraphsyncEnabled": false,
"Libp2pStreamMounting": false,
"OptimisticProvide": false,
"OptimisticProvideJobsPoolSize": 0,
"P2pHttpProxy": false,
"StrategicProviding": false,
"UrlstoreEnabled": false
},
"Gateway": {
"APICommands": [],
"HTTPHeaders": {
"Access-Control-Allow-Headers": [
"X-Requested-With",
"Range",
"User-Agent"
],
"Access-Control-Allow-Methods": [
"GET"
],
"Access-Control-Allow-Origin": [
"*"
]
},
"NoDNSLink": false,
"NoFetch": false,
"PathPrefixes": [],
"PublicGateways": null,
"RootRedirect": ""
},
"Identity": {
"PeerID": "<nop>",
"PrivKey": "<no>"
},
"Internal": {},
"Ipns": {
"RecordLifetime": "",
"RepublishPeriod": "",
"ResolveCacheSize": 128
},
"Migration": {
"DownloadSources": [],
"Keep": ""
},
"Mounts": {
"FuseAllowOther": false,
"IPFS": "/ipfs",
"IPNS": "/ipns"
},
"Peering": {
"Peers": null
},
"Pinning": {
"RemoteServices": {}
},
"Plugins": {
"Plugins": null
},
"Provider": {
"Strategy": ""
},
"Pubsub": {
"DisableSigning": false,
"Router": ""
},
"Reprovider": {},
"Routing": {
"Methods": null,
"Routers": null
},
"Swarm": {
"AddrFilters": [
"/ip4/10.0.0.0/ipcidr/8",
"/ip4/100.64.0.0/ipcidr/10",
"/ip4/169.254.0.0/ipcidr/16",
"/ip4/172.16.0.0/ipcidr/12",
"/ip4/192.0.0.0/ipcidr/24",
"/ip4/192.0.2.0/ipcidr/24",
"/ip4/198.18.0.0/ipcidr/15",
"/ip4/198.51.100.0/ipcidr/24",
"/ip4/203.0.113.0/ipcidr/24",
"/ip4/240.0.0.0/ipcidr/4",
"/ip6/100::/ipcidr/64",
"/ip6/2001:2::/ipcidr/48",
"/ip6/2001:db8::/ipcidr/32",
"/ip6/fc00::/ipcidr/7",
"/ip6/fe80::/ipcidr/10"
],
"ConnMgr": {},
"DisableBandwidthMetrics": false,
"DisableNatPortMap": true,
"RelayClient": {},
"RelayService": {},
"ResourceMgr": {
"Limits": {},
"MaxMemory": "750MB"
},
"Transports": {
"Multiplexers": {},
"Network": {},
"Security": {}
}
}
}
Bunch 'o profiles: https://dweb.link/ipfs/Qmays49dWQUXQzh7NgRFRDvrp5qT4nuCcSruAgGmUr539E
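For anyone who wants to check the two settings under discussion in a config like the one above, here is a minimal sketch. It assumes the config is available as JSON text (for example, the output of `ipfs config show`). The helper name `memory_settings` is hypothetical and not part of Kubo.

```python
import json


def memory_settings(config_text: str) -> dict:
    """Pull out the two settings this thread is about from a Kubo config.

    Hypothetical helper for illustration; missing keys fall back to
    False / None rather than raising.
    """
    cfg = json.loads(config_text)
    experimental = cfg.get("Experimental") or {}
    resource_mgr = (cfg.get("Swarm") or {}).get("ResourceMgr") or {}
    return {
        "accelerated_dht": experimental.get("AcceleratedDHTClient", False),
        "max_memory": resource_mgr.get("MaxMemory"),
    }


# Trimmed-down sample with the same values as the config posted above.
sample = """
{
  "Experimental": {"AcceleratedDHTClient": true},
  "Swarm": {"ResourceMgr": {"Limits": {}, "MaxMemory": "750MB"}}
}
"""

print(memory_settings(sample))
```

This only reports what is configured. It says nothing about whether the node actually stays within `MaxMemory`, which is the point being questioned in this issue.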
I can't find the issues, but this is indeed a known problem. The accelerated DHT client and batch provider load the CIDs for your 22 TB of data into memory (slice of ptr to CID + len) and pass them to the accelerated DHT client. Note: the announcement cost per CID goes down as you have more CIDs, since the accelerated DHT client batches them up.
After some back-of-the-envelope computation, it should be 1~2 GiB of memory just for the CIDs, counting pointer and GC overhead.
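A rough sketch of that kind of estimate, for illustration only. Every number is an assumption rather than a measurement from Kubo: 36 bytes per CID (a CIDv1 with a sha2-256 digest), 16 bytes for the pointer + length header on a 64-bit machine, and a GC factor of 2 (Go's default `GOGC=100` lets the heap grow to roughly twice the live data). The CID counts are hypothetical, since the thread does not say how many CIDs the node holds.

```python
def estimate_cid_memory(num_cids, cid_bytes=36, header_bytes=16, gc_factor=2.0):
    """Estimate the bytes needed to hold `num_cids` CIDs in a slice.

    All defaults are assumptions for illustration:
      cid_bytes    - CIDv1 + sha2-256 digest (a CIDv0 would be 34 bytes)
      header_bytes - pointer + length per entry on a 64-bit machine
      gc_factor    - heap headroom under Go's default GOGC=100
    """
    live_bytes = num_cids * (cid_bytes + header_bytes)
    return live_bytes * gc_factor


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


# Hypothetical CID counts; substitute the real number for your repo.
for n in (10_000_000, 20_000_000):
    print(f"{n:>12,} CIDs -> {to_gib(estimate_cid_memory(n)):.2f} GiB")
```

Allocator size-class rounding and any temporary copies made while batching are not modelled, so the real footprint could be higher.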
Will this memory be freed after loading or does it always stay in memory?
IPFS with GOMEMLIMIT still within 1GB by the way. Also not swapping (much).
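For reference, `GOMEMLIMIT` is an environment variable read by the Go runtime (Go 1.19 and later). It sets a soft memory limit: the garbage collector works harder as usage approaches it, but the limit is not a hard cap. A minimal sketch of preparing the environment for the daemon, assuming the Kubo binary was built with a Go version that supports it. The helper name `daemon_env` is hypothetical.

```python
import os


def daemon_env(mem_limit="1GiB", base_env=None):
    """Return a copy of the environment with GOMEMLIMIT set.

    Hypothetical helper. `mem_limit` uses the Go runtime's format:
    a number with an optional unit suffix such as MiB or GiB.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GOMEMLIMIT"] = mem_limit
    return env


# Hypothetical usage (not executed here):
#   import subprocess
#   subprocess.run(["ipfs", "daemon"], env=daemon_env("1GiB"))
```

Because the limit is soft, the process can still exceed it if the live data itself is larger than the limit.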
It is freed after the providing process is finished.
Just tried again with latest 0.21 release and accel. DHT. Even with My questions:
|
Go is not a language that gives programmers the tools required to precisely control memory usage, so trying to set a limit in the config wouldn't be very useful.
Hello guys. Any update on this issue?
For the issue that IIUC is being discussed here - memory consumption from loading a huge number of CIDs into memory then There are two options:
At the moment the maintainer's preference is to wait until we can do #10097, as that will end up being better for everyone and unlock more options for further reducing resource consumption.
Checklist
Installation method
third-party binary
Version
Config
Description
Context
Starting kubo with ResourceMgr.MaxMemory set to 1 or 2 GB consistently results in an OOM kill within minutes of startup.
Observations
When the node is started;
With AcceleratedDHTClient set to false, this problem does not occur. Hence, it seems that the Accelerated DHT client does not adhere to the resource manager's memory limit. Perhaps the defaults just need to be made accelerated DHT aware.
Might want to fix this before taking it out of the experimental phase. ;)
Possibly related/relevant