SEGV when python code is pycall-ed from Puma (works without puma) #185

Closed
snickell opened this issue Aug 6, 2024 · 4 comments

Comments

snickell commented Aug 6, 2024

We are hitting an issue where code that works from plain irb, or from the Rails console, crashes when run from a Rails controller. I have reduced the issue to a minimal repro that uses only Puma. I'm comfortable with C and C debugging tools if that helps; I'm just trying to figure out where to start.

Minimal repro (tried to make it very simple): https://github.com/snickell/pycall_puma_crash

We'd like to use pycall.rb for code.org (github). It's a very clever approach, thank you @mrkn 🙇

snickell commented Aug 6, 2024

I suspect this is the same issue as #175, but unlike that issue I have reduced it to only Puma, PyCall, and one Python module. I will work on finding a simpler Python module, because llmguard is complicated. It would be great to find a repro that only uses pandas (as hinted at by #175).

snickell changed the title from "SEGV when python code is pycall-ed from Puma (same code works without puma)" to "SEGV when python code is pycall-ed from Puma (works without puma)" on Aug 6, 2024

snickell commented Aug 6, 2024

I've got a pandas-only repro now, and I'm updating the main comment to match.

snickell commented Aug 6, 2024

Aha! Even when you set threads=1, Puma still handles requests on a different thread from the one that ran the initial code. This will affect Rails users as well:

# Setup our local venv (using pdm, in .venv)
ENV['PYTHON'] = `pdm run which python`.strip
site_dir = `pdm run python -c 'import site; print(site.getsitepackages()[0])'`.strip

require 'pycall'
$pycall_thread_id = Thread.current.object_id

# This is to setup our local venv
site = PyCall.import_module('site')
site.addsitedir(site_dir)

module CrashPuma

  def self.do_crash
    raise "Thread IDs did not match: started with thread #{$pycall_thread_id}, but request is on thread #{Thread.current.object_id}" if $pycall_thread_id != Thread.current.object_id
    # => "Thread IDs did not match...."

    puts "About to crash (if running in puma)"

    pandas = PyCall.import_module('pandas')
    data = pandas.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv', sep: ';')
    puts data.head()

    puts "IT DID NOT CRASH"
  end
end

Conclusion: there may not be a safe way to use PyCall from Puma-based servers, including a default Rails configuration, because Puma always starts a separate thread for requests, even with threads=1.
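
One workaround that might be worth trying, assuming the crash really does come from Python being called on a different thread than the one that initialized it, is to funnel every Python call through a single long-lived thread. The sketch below is plain Ruby and only illustrates that pattern. `DedicatedThreadRunner` and `run` are hypothetical names, not part of PyCall or Puma, and this has not been confirmed to fix the SEGV.

```ruby
# Hypothetical helper (not part of PyCall or Puma): runs every submitted
# block on one long-lived worker thread and hands the result back.
class DedicatedThreadRunner
  def initialize
    @jobs = Queue.new
    @worker = Thread.new do
      loop do
        job, reply = @jobs.pop # blocks until a job is submitted
        begin
          reply << [:ok, job.call]
        rescue StandardError => e
          # Pass the error back so it is raised on the calling thread.
          reply << [:error, e]
        end
      end
    end
  end

  # Blocks the calling thread until the block has finished on the worker.
  # Do not call this from inside another run block: it would deadlock.
  def run(&block)
    reply = Queue.new
    @jobs << [block, reply]
    status, value = reply.pop
    raise value if status == :error

    value
  end
end

# Assumed usage with PyCall (untested): load PyCall and touch Python only
# inside the runner, so that a Puma request thread never calls it directly.
#
#   PY = DedicatedThreadRunner.new
#   PY.run { require 'pycall' }
#   rows = PY.run { PyCall.import_module('pandas').read_csv(path).shape[0] }
```

Under the same assumption, objects returned by PyCall would presumably also need to stay inside the block, because calling methods on them from a request thread would reach Python from that thread again. Converting results to plain Ruby values before returning them, as the commented usage does with a row count, avoids that.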

snickell commented Aug 8, 2024

See: #96
