During run = SIGSEGV, segmentation fault occurred #6538
-
From just a quick look at the trace, it's coming from parallel I/O for the ice component, so I'd start by looking at how you've configured the IO tasks. Also, you may be spreading the ice a bit thin for this problem size on that many nodes. Someone else might chime in with other ideas...
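To make the "spreading thin" concern concrete, here is a rough sketch that tabulates how much work each MPI task gets as the node count grows, along with how many I/O tasks a given stride would imply. The 128 tasks per node comes from the original post; the grid size and the I/O stride are made-up placeholders, not values taken from E3SM or from this case, and the stride rule (one I/O task per N compute tasks) is a simplifying assumption.

```python
def decomposition_summary(nodes, tasks_per_node, grid_cells, io_stride):
    """Return (total_tasks, cells_per_task, io_tasks) for a run layout."""
    total_tasks = nodes * tasks_per_node
    # Average number of grid cells each MPI task owns.
    cells_per_task = grid_cells / total_tasks
    # Assumed rule: one I/O task per `io_stride` compute tasks, at least one.
    io_tasks = max(1, total_tasks // io_stride)
    return total_tasks, cells_per_task, io_tasks


# Hypothetical values for illustration only.
HYPOTHETICAL_GRID_CELLS = 100_000
HYPOTHETICAL_IO_STRIDE = 64
TASKS_PER_NODE = 128  # from the original post

for nodes in (1, 4, 8, 16):
    total, per_task, io = decomposition_summary(
        nodes, TASKS_PER_NODE, HYPOTHETICAL_GRID_CELLS, HYPOTHETICAL_IO_STRIDE
    )
    print(f"{nodes:>2} nodes: {total:>4} tasks, "
          f"{per_task:8.1f} cells/task, {io:>2} I/O tasks")
```

If the cells-per-task figure gets very small for a component, or the implied I/O task count grows large, that layout is worth revisiting before chasing library problems.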
-
Hi Phil, Thank you very much for your prompt reply. Would you mind sharing how I would go about configuring the IO tasks for the model components? Also, when I ran run.v3.LR.historical_0101_b2000.pm-cpu.sh with the custom-2_1x1_ndays setting (following the tutorial suggestions), I got this error, which also seems to be about parallel I/O. I'm not sure whether it's related to the other error, but maybe it's a helpful message to you.
-
Hello,
I understand that you don't support use on unsupported machines. But I've been trying to run E3SMv2.1 and E3SMv3 on my local machine at Indiana University and cannot get past the error below. I think it has to do with my MPI library, or perhaps with the modules I have loaded, though I'm just guessing. If anyone can help, it would be greatly appreciated.
To describe my problem: I had successful runs of I20TRELM on 1, 4, and 8 nodes (128 procs/node), but when I use 16 nodes, I get an error message similar to the one below. I then tried the Tutorial 2024 Day 1 practicum, but when I ran the script run.v3.LR.historical_0101_b2000.pm-cpu.sh with the custom-2_1x1_ndays setting, I received errors. First, the E3SM error log said "out of memory", so I increased the node count to custom-4. Then I got the error below, which is similar to the error I had previously with I20TRELM.
If anyone wants to be a point of contact, that would be great and I can send you more information about my machine and the config.xml files. Thanks! Paul