Strange behavior under linux #60

Open
Valtosin opened this issue Oct 14, 2022 · 12 comments

@Valtosin

Hi, I'm on Ubuntu 20.04. Let's start.

Everything works as it should until DreamDaemon deadlocks. This happens spontaneously, even when no map loading is in progress, although loading maps (that is, creating new z-levels or changing turfs to space and back) makes it occur sooner.

The hang occurs without any runtimes or crashes. Nothing happens at all: the server freezes completely, with 0% CPU load and unchanged memory usage (about 1500 MB, sometimes more, sometimes less). SIGUSR2 does not work either, although it should print a backtrace and other info to the DD output, which means the process is completely dead. SIGBUS on DD does work, though, and this is all I could get in the log from it:

Backtrace after loading 15 gateways:

BUG: Crashing due to an illegal operation!
proc name: New (/datum/gas_mixture/New)
  source file: gas_mixture.dm,27
  usr: null
  src: /datum/gas_mixture (/datum/gas_mixture)
  call stack:
/datum/gas_mixture (/datum/gas_mixture): New(0)
/datum/pipeline (/datum/pipeline): reconcile air()
/datum/pipeline (/datum/pipeline): process()
Гипер-Атмос (/datum/controller/subsystem/air): process pipenets(0)
Гипер-Атмос (/datum/controller/subsystem/air): fire(0)
Гипер-Атмос (/datum/controller/subsystem/air): ignite(0)
Master (/datum/controller/master): RunQueue()
Master (/datum/controller/master): Loop(2)
Master (/datum/controller/master): StartProcessing(0)

Backtrace for BYOND 514.1589 on Linux:
Generated at Fri Oct 14 10:58:14 2022

DreamDaemon [0x8048000, 0x0], [0x8048000, 0x804bd94]
linux-gate.so.1 [0xf7f30000, 0xf7f30b40], [0xf7f30000, 0xf7f30b49]
linux-gate.so.1 [0xf7f30000, 0xf7f30b70], [0xf7f30000, 0xf7f30b70]
linux-gate.so.1 [0xf7f30000, 0xf7f30b40], [0xf7f30000, 0xf7f30b49]
libc.so.6 0xffe70, 0xffe9b (syscall)
libauxmos.so [0xf4f79000, 0x0], 0x1426b
libauxmos.so [0xf4f79000, 0x0], 0x58077
libauxmos.so 0x73fd0, 0x742d9
libauxmos.so [0xf4f79000, 0x0], 0xc3bc7
libauxlua.so [0xc79fe000, 0x0], 0x121256
libbyond.so [0xf7895000, 0x0], 0x332644
libbyond.so [0xf7895000, 0x0], 0x343559
libbyond.so [0xf7895000, 0x0], 0x313903
libbyond.so [0xf7895000, 0x0], 0x330412
libauxmos.so [0xf4f79000, 0x0], 0xc3c43
libauxlua.so [0xc79fe000, 0x0], 0x121256
libbyond.so [0xf7895000, 0x0], 0x332644
libbyond.so [0xf7895000, 0x0], 0x343e3a
libbyond.so [0xf7895000, 0x0], 0x318b4d
libbyond.so [0xf7895000, 0x0], 0x330412
libauxmos.so [0xf4f79000, 0x0], 0xc3c43
libauxlua.so [0xc79fe000, 0x0], 0x121256
libbyond.so [0xf7895000, 0x0], 0x332644
libbyond.so [0xf7895000, 0x0], 0x343559
libbyond.so [0xf7895000, 0x0], 0x313903
libbyond.so [0xf7895000, 0x0], 0x330412
libauxmos.so [0xf4f79000, 0x0], 0xc3c43
libauxlua.so [0xc79fe000, 0x0], 0x121256
libbyond.so [0xf7895000, 0x0], 0x332644
libbyond.so [0xf7895000, 0x0], 0x343559
libbyond.so [0xf7895000, 0x0], 0x313903
libbyond.so [0xf7895000, 0x0], 0x330412
libauxmos.so [0xf4f79000, 0x0], 0xc3c43
libauxlua.so [0xc79fe000, 0x0], 0x121256
libbyond.so [0xf7895000, 0x0], 0x332644
libbyond.so [0xf7895000, 0x0], 0x343559
libbyond.so [0xf7895000, 0x0], 0x313903
libbyond.so [0xf7895000, 0x0], 0x330412
libauxmos.so [0xf4f79000, 0x0], 0xc3c43
libauxlua.so [0xc79fe000, 0x0], 0x121256
libbyond.so [0xf7895000, 0x0], 0x332644
libbyond.so [0xf7895000, 0x0], 0x343559
libbyond.so [0xf7895000, 0x0], 0x313903
libbyond.so [0xf7895000, 0x0], 0x330412
libauxmos.so [0xf4f79000, 0x0], 0xc3c43
libauxlua.so [0xc79fe000, 0x0], 0x121256
libbyond.so [0xf7895000, 0x0], 0x332644
libbyond.so [0xf7895000, 0x0], 0x343559
libbyond.so [0xf7895000, 0x0], 0x315c47
libbyond.so [0xf7895000, 0x0], 0x330412

Recent proc calls:
/proc/__detect_auxtools
/proc/__detect_auxtools
/proc/__detect_auxtools
/proc/__detect_auxtools
/datum/gas_mixture/New
/datum/pipeline/proc/return_air
/datum/pipeline/proc/reconcile_air
/datum/pipeline/process
/proc/__detect_auxtools
/proc/__detect_auxtools
/proc/__detect_auxtools
/proc/__detect_auxtools
/datum/gas_mixture/New
/datum/pipeline/proc/return_air
/datum/pipeline/proc/reconcile_air
/datum/pipeline/process

Another backtrace log looks the same and also blames /datum/gas_mixture/New, but that one happened without spam-loading gateways, just during a normal round.

Yes, I tried different auxmos revisions after 2.0.0 on 514.1575, 1583 and 1585 with some modifications, but the result was similar to the above.

Maybe it's also because we don't use the reactions hook, only "katmos", but the old version at commit 91708eb5cf289f2176630ab168d7d4f9da830523 works without any hangs on 514.1575. It works the same on 514.1589, though only after adding some crutches to the old library; maybe this will be useful for someone: https://github.com/frosty-dev/auxmos/tree/assblast

Since I could not find a working server on the latest version, I used code from several repositories that use this library, for example:
yogstation13/Yogstation#13479;
Citadel-Station-13/Citadel-Station-13#15864;
https://github.com/BeeStation/BeeStation-Hornet - we have used their supercruise since the extools days, which makes for a fun crash test for atmos;

Our repo with the latest auxmos is: https://github.com/frosty-dev/white/tree/0945d9327a82e160cab7a34f1f6336a652916f0a
Our auxmos (auxtools repo changed to the mothblocks PR for 1588+ signature compatibility): https://github.com/frosty-dev/auxmos

I don't know, maybe our code is just too shitty, but I still thought it right to report at least something; maybe this will help you.

To be honest, every auxmos revision works as expected on Windows.

@jupyterkat
Collaborator

jupyterkat commented Nov 9, 2022

it may be caused by you loading maps on initialized turfs without setting the turfs to sleep. are you sure this is a linux-only bug?

you'll need to get the stack trace by attaching gdb to dreamdaemon and pausing it whenever it deadlocks. i can't help much without the actual stack trace (i also need the stack trace of all the active threads, not just one).

@Putnam3145
Owner

Could also be __detect_auxtools still sucking for linux

@Valtosin
Author

Valtosin commented Jan 1, 2023

probably related to Amanieu/parking_lot#212

@jupyterkat
Collaborator

i know that's the issue, but your backtrace gives me nothing useful. i asked for a backtrace of all the threads.

@Valtosin
Author

Valtosin commented Jan 2, 2023

sorry, here it is
gdb.txt

@out-of-phaze
Contributor

out-of-phaze commented Jan 3, 2023

@Valtosin You might want to try this version; @Lohikar modified it to use a channel instead of a vector. It worked fine in local testing, but I haven't been able to reproduce this issue myself, being primarily on Windows (and it doesn't happen in WSL2 for me).

https://github.com/out-of-phaze/auxmos/releases/tag/v2.3.0-channel
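
For readers unfamiliar with the pattern, here is a minimal sketch of what "a channel instead of a vector" could look like. It is not the code from the linked release: `Mixture` and `allocate_via_channel` are hypothetical names, and only the Rust standard library is assumed. Worker threads send their requests over a channel, and a single owning thread is the only one that grows the vector, so no worker ever waits on a lock around it.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a gas mixture; not the auxmos type.
#[derive(Debug, Clone, PartialEq)]
struct Mixture {
    moles: f32,
}

/// Spawns `workers` threads that each request one new mixture, then lets the
/// calling thread, the only owner of the vector, apply every request.
fn allocate_via_channel(workers: usize) -> Vec<Mixture> {
    let (tx, rx) = mpsc::channel::<Mixture>();
    let handles: Vec<_> = (0..workers)
        .map(|i| {
            let tx = tx.clone();
            thread::spawn(move || {
                // Workers never touch the vector, so they take no lock on it.
                tx.send(Mixture { moles: i as f32 }).expect("receiver alive");
            })
        })
        .collect();
    // Drop the original sender so the receive loop ends once all workers finish.
    drop(tx);

    let mut mixtures = Vec::new();
    for m in rx {
        mixtures.push(m); // growth happens on one thread only
    }
    for h in handles {
        h.join().expect("worker panicked");
    }
    mixtures
}

fn main() {
    let mixtures = allocate_via_channel(4);
    let total: f32 = mixtures.iter().map(|m| m.moles).sum();
    println!("allocated {} mixtures, {} moles in total", mixtures.len(), total);
}
```

The order in which requests arrive is not deterministic, so anything that depends on a stable index would have to be assigned by the owning thread as it pushes.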

https://github.com/out-of-phaze/auxmos/releases/tag/v2.3.0-nonthreaded-resize
Try this one.

@Valtosin
Author

Valtosin commented Jan 5, 2023

https://github.com/out-of-phaze/auxmos/releases/tag/v2.3.0-nonthreaded-resize
Try this one.

This one works perfectly, no runtimes about atmos and no deadlocks. Neat. I forgot to check if atmos works at all, but I hope it does.

dreamseeker_2023-01-05_08-08-21

@out-of-phaze
Contributor

I tested it locally, gases moved properly, burned properly, and I didn't suffocate to death on spawning in. Planetary atmos also appeared to work.

@Valtosin
Author

Valtosin commented Jan 5, 2023

So, now it crashes on prod after a while, when this runtime appears:

[20:57:36] Runtime in spaceman_dmm.dm,32: Could not read Value(1, 18657).air
  proc name: auxtools stack trace (/proc/auxtools_stack_trace)
  src: null
  call stack:
  auxtools stack trace("Could not read Value(1, 18657)...")
  Auxtools Callbacks (/datum/controller/subsystem/callbacks): fire(0)
  Auxtools Callbacks (/datum/controller/subsystem/callbacks): ignite(0)
  Master (/datum/controller/master): RunQueue()
  Master (/datum/controller/master): Loop(2)
  Master (/datum/controller/master): StartProcessing(0)

I'll try to reproduce this also.

@out-of-phaze
Contributor

Right, entirely possible that I botched the modification in some subtly broken way. Literally all I did was move it out of a task and into the main thread, so maybe some references held by other threads are getting lost in the shuffle when the gas mixtures vector resizes?
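
The hypothesis above can be illustrated with a small sketch. It assumes (this is not confirmed anywhere in the thread) that worker threads refer to gas mixtures by index rather than by reference; `Arena`, `read`, `grow_to` and `grow_then_read` are hypothetical names, not auxmos APIs. The point is only that when a single thread resizes the vector under a write lock and readers look values up by index on each access, a reallocation cannot leave a reader holding a stale reference.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for the gas mixtures vector; not the auxmos type.
struct Arena {
    mixtures: RwLock<Vec<f32>>,
}

impl Arena {
    fn new(len: usize) -> Self {
        Arena { mixtures: RwLock::new(vec![0.0; len]) }
    }

    /// Readers look values up by index on every access, so a reallocation
    /// caused by a resize cannot leave them with a dangling reference.
    fn read(&self, idx: usize) -> Option<f32> {
        self.mixtures.read().expect("lock poisoned").get(idx).copied()
    }

    /// Intended to be called from one thread only (the "main thread").
    fn grow_to(&self, len: usize) {
        let mut guard = self.mixtures.write().expect("lock poisoned");
        if guard.len() < len {
            guard.resize(len, 0.0);
        }
    }
}

/// Grows the arena on the calling thread, then reads `idx` from a worker.
fn grow_then_read(arena: Arc<Arena>, new_len: usize, idx: usize) -> Option<f32> {
    arena.grow_to(new_len);
    let worker = {
        let arena = Arc::clone(&arena);
        thread::spawn(move || arena.read(idx))
    };
    worker.join().expect("worker panicked")
}

fn main() {
    let arena = Arc::new(Arena::new(2));
    println!("{:?}", grow_then_read(Arc::clone(&arena), 4, 3));
    println!("{:?}", grow_then_read(arena, 4, 10));
}
```

An index that is still out of range after the resize comes back as `None`, which a caller can handle explicitly.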

@Valtosin
Author

So, no crashes for about two weeks. I think it's probably fixed for linux?
