Amazon ActiveMQ drops connection for long running listeners #163

Open
austinbittinger-inmar opened this issue Oct 18, 2019 · 3 comments
Labels
Await User Reply Waiting For a reply from the user.

Comments

@austinbittinger-inmar

Hi there! My team currently uses ActiveMQ as a centralized queue between all of our microservices, and we've adopted this gem to interact with ActiveMQ. I've been scratching my head for a little while, as it seems like the connection inevitably drops after 20 minutes. Here's a simplified version of what we're using:

require 'stomp'
require 'securerandom'

client = Stomp::Client.new(
  hosts: [
    login: ENV['STOMP_USERNAME'],
    passcode: ENV['STOMP_PASSCODE'],
    host: ENV['STOMP_HOST'],
    port: ENV['STOMP_PORT'],
    ssl: true
  ],
  connect_headers: {
    'client-id' => 'my-service',
    'accept-version' => '1.2',
    'host' => 'localhost'
  }
)

client.subscribe('queue/1', id: SecureRandom.uuid, ack: 'client') do |msg|
  Handler1.perform_async(msg)
  client.acknowledge(msg)
end

client.subscribe('queue/2', id: SecureRandom.uuid, ack: 'client') do |msg|
  Handler2.perform_async(msg)
  client.acknowledge(msg)
end

client.join

Running this locally, the client will trigger on_miscerr and reconnect after about 10 minutes, but against Amazon MQ, the connection will drop but the client does not attempt a reconnect.

With a custom logger logging every transaction here, the client receives an on_receive event, and then drops the connection. Do you have any suggestions for how I could go about debugging this issue? I've tried just about every parameter on the client, and logged everything I can. If it helps, this client is running within a Kubernetes pod, pointing at a VPC only configuration of Amazon ActiveMQ.

@gmallard

I think it is unlikely that this is a gem bug. You are operating in a complex network environment, and as with all networking apps, anything can go wrong at any time.

Any chance of you showing me your logs?

Make sure your custom logger emits all the information it possibly can, including the original exception and stack trace data if possible.

In the logger, try things like:

  # Log miscellaneous errors
  # (info is assumed to be a helper defined elsewhere in the logger class)
  def on_miscerr(parms, errstr)
    begin
      @log.debug "Miscellaneous Error #{info(parms)}"
      @log.debug "Miscellaneous Error String #{errstr}"
      @log.debug "Miscellaneous Error All Parms #{parms.inspect}"
      # SSL failures carry the original exception in parms
      if parms[:ssl_exception]
        @log.debug "SSL Miscellaneous Error Parms: #{parms[:ssl_exception]}"
        @log.debug "SSL Miscellaneous Error Message: #{parms[:ssl_exception].message}"
        btr = parms[:ssl_exception].backtrace.join("\n")
        @log.debug "Backtrace SME: #{btr}"
      end
    rescue
      # Never let a logging failure propagate into the client
      @log.debug "Miscellaneous Error oops"
    end
  end
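
For a dropped connection, the failure callbacks are probably the most informative ones to log. The following is a minimal sketch of a standalone logger class, not the gem's own example logger. The class name `VerboseStompLogger` and the message formats are hypothetical, and the callback names and signatures (`on_connectfail`, `on_miscerr`, `on_hbread_fail`, `on_hbwrite_fail`) are assumed to match those in the gem's example logger, so check them against the gem version in use.

```ruby
require 'logger'

# Hypothetical logger class for illustration. Callback names are assumed
# to match the stomp gem's example logger; verify against your gem version.
class VerboseStompLogger
  def initialize(io = $stdout)
    @log = Logger.new(io)
    @log.level = Logger::DEBUG
  end

  # Called when a connection attempt fails.
  def on_connectfail(parms)
    @log.debug "Connect failed: #{parms.inspect}"
  end

  # Called for miscellaneous errors; errstr describes the failure.
  def on_miscerr(parms, errstr)
    @log.debug "Miscellaneous error: #{errstr} #{parms.inspect}"
  end

  # Called when an expected heartbeat could not be read from the broker.
  def on_hbread_fail(parms, ticker_data = {})
    @log.debug "Heartbeat read failed: #{ticker_data.inspect} #{parms.inspect}"
  end

  # Called when the client could not write a heartbeat to the broker.
  def on_hbwrite_fail(parms, ticker_data = {})
    @log.debug "Heartbeat write failed: #{ticker_data.inspect} #{parms.inspect}"
  end
end
```

An instance would be passed to the client through the `logger:` key of the connect hash, alongside `hosts:` and `connect_headers:`.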

Do you have access to the AMQ logs? If so, do they show anything "interesting"?

Looking at your connect hash: have you tried using well-selected values for heartbeats? That is a shot in the dark, but it might help.
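
As a rough illustration of the heartbeat suggestion: in STOMP 1.1 and later, the client requests heartbeats with a `heart-beat` connect header of the form `cx,cy` (milliseconds), and the broker answers with its own `sx,sy`. The sketch below builds such headers and computes the intervals that would result under the STOMP heart-beating rules. The helper names and the 10-second values are assumptions chosen for illustration, not values recommended anywhere in this thread.

```ruby
# Build connect headers that request heartbeats (hypothetical helper).
# send_ms: smallest interval (ms) at which the client can send heartbeats.
# recv_ms: interval (ms) at which the client would like to receive them.
def heartbeat_connect_headers(send_ms, recv_ms, base = {})
  base.merge(
    'accept-version' => '1.2', # heartbeats require STOMP 1.1 or later
    'heart-beat' => "#{send_ms},#{recv_ms}"
  )
end

# Apply the STOMP 1.1/1.2 negotiation rules (hypothetical helper).
# client_hb is the "cx,cy" the client sent, broker_hb the "sx,sy" reply.
# Returns [client_to_broker_ms, broker_to_client_ms]; 0 means no heartbeats.
def negotiated_heartbeats(client_hb, broker_hb)
  cx, cy = client_hb.split(',').map(&:to_i)
  sx, sy = broker_hb.split(',').map(&:to_i)
  client_to_broker = cx.zero? || sy.zero? ? 0 : [cx, sy].max
  broker_to_client = sx.zero? || cy.zero? ? 0 : [sx, cy].max
  [client_to_broker, broker_to_client]
end

# Illustrative values only: ask for heartbeats every 10 seconds each way.
headers = heartbeat_connect_headers(
  10_000, 10_000,
  'client-id' => 'my-service',
  'host' => 'localhost'
)
```

The resulting `headers` hash would take the place of the `connect_headers:` value in the connect hash. If something between the client and the broker closes idle connections, the negotiated interval would need to be shorter than that idle timeout to have any effect.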

@gmallard

I changed your code above just enough to get it running here. Started it.

Also started two producers. One sends to queue 1 every 30 seconds, the other to queue 2 every 20 seconds.
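
A producer of this kind could look roughly like the sketch below. This is not the code that was actually run here. The `produce` helper, its parameters, and the message body format are hypothetical, and `client` is assumed to be any object that responds to `publish(destination, body)`, such as a connected `Stomp::Client`.

```ruby
# Publish `count` messages to `destination`, pausing `interval` seconds
# between sends. `sleeper` is injectable so the loop can be exercised
# without real delays. All names here are illustrative.
def produce(client, destination, interval:, count:, sleeper: Kernel.method(:sleep))
  count.times do |i|
    client.publish(destination, "message #{i + 1} at #{Time.now.utc}")
    # No pause is needed after the final message.
    sleeper.call(interval) if i < count - 1
  end
end
```

With a connected client, `produce(client, 'queue/1', interval: 30, count: 100)` would approximate the first producer, and the same call with `'queue/2'` and `interval: 20` the second.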

Connections to AMQ on localhost.

Right now, that has been running for about 20 hours with no failures.

I doubt that I will be able to recreate this problem.

@gmallard

I cannot recreate the problem you describe.

I have had your code running for as long as 4 days, with no problems.

If you need help from me I am going to need to see all of the detail in logs from the logger.

Incidentally, there is an enhancement to the example logger the gem provides. It is on the DEV branch only at present.

@gmallard gmallard added the Await User Reply Waiting For a reply from the user. label Nov 21, 2019