Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add test simulating unreachable socket for reporter #82

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

justinjc
Copy link
Collaborator

@justinjc justinjc commented Oct 1, 2020

This test sets up:

  1. A reporter that reports every second.
  2. A UDP listener that the reporter sends its metrics to.
  3. An emitter that constantly emits a unique timer. A timer is used because timers are directly reported, whereas counters and gauges have their own buffering.

The scenario:

  1. The emitter is allowed to emit for five seconds.
  2. The socket backing the UDP listener is closed, simulating the reporter's communicating endpoint being unreachable for some reason.
  3. The emitter is allowed to proceed for ten more seconds.
  4. Output a thread dump to see the state of the world.

Observations when run on git ref:

  • 978287b: timers are recorded, then after the socket is closed, timers continue to be recorded.

    • Travis run (search SocketCloseTest)
    • Thread dump of emitter thread:
      "EMITTER THREAD" Id=44 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@5181cf78
          at sun.misc.Unsafe.park(Native Method)
          -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@5181cf78
          at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
          at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
          at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
          at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          ...
      
  • 3280651: timers are recorded, then after the socket is closed, timers are blocked from being recorded.

    • Travis run (search SocketCloseTest)
    • Thread dump of emitter thread:
      "EMITTER THREAD" Id=44 WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3dac736c
          at sun.misc.Unsafe.park(Native Method)
          -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3dac736c
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
          at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:350)
          at com.uber.m3.tally.m3.M3Reporter.queueSizedMetric(M3Reporter.java:445)
          at com.uber.m3.tally.m3.M3Reporter.reportTimer(M3Reporter.java:317)
          at com.uber.m3.tally.TimerImpl.record(TimerImpl.java:55)
          at com.uber.m3.tally.m3.SocketCloseTest$MetricsEmitter.run(SocketCloseTest.java:116)
          ...
      
  • 0589dcc (the latest 0.9.0 release): timers are recorded, then after the socket is closed, timers continue to be recorded.

    • Travis run (search SocketCloseTest)
    • Thread dump of emitter thread:
      "EMITTER THREAD" Id=55 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@56882ba4
          at sun.misc.Unsafe.park(Native Method)
          -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@56882ba4
          at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
          at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:385)
          at com.uber.m3.tally.m3.M3Reporter.enqueue(M3Reporter.java:465)
          at com.uber.m3.tally.m3.M3Reporter.reportTimer(M3Reporter.java:354)
          at com.uber.m3.tally.TimerImpl.record(TimerImpl.java:55)
          at com.uber.m3.tally.m3.SocketCloseTest$MetricsEmitter.run(SocketCloseTest.java:115)
          ...
      

@justinjc justinjc requested review from prateek and vdarulis October 1, 2020 04:48
@justinjc justinjc changed the title [WIP] Add SocketCloseTest [WIP] Add test simulating unreachable socket for reporter Oct 1, 2020
@alexeykudinkin
Copy link
Contributor

alexeykudinkin commented Oct 1, 2020

@justinjc your test isn't simulating conditions properly: Processor thread is continues to run in the thread dump of your first test-run:

   "pool-9-thread-1" Id=28 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@29b2c15a
    	at sun.misc.Unsafe.park(Native Method)
    	-  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@29b2c15a
    	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    	at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    	at com.uber.m3.tally.m3.M3Reporter$Processor.run(M3Reporter.java:485)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)

@prateek
Copy link
Collaborator

prateek commented Oct 1, 2020

The same thread you're pointing out continues to run in the second test as well:

"pool-9-thread-1" Id=28 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@29b2c15a
    	at sun.misc.Unsafe.park(Native Method)
    	-  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@29b2c15a
    	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    	at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    	at com.uber.m3.tally.m3.M3Reporter$Processor.run(M3Reporter.java:485)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)
    	Number of locked synchronizers = 1
    	- java.util.concurrent.ThreadPoolExecutor$Worker@a4ca1d1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants