Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scan documentation update and stop scan bug fix #513

Merged
merged 5 commits into from
Jun 8, 2017
Merged

Conversation

cupakromer
Copy link
Contributor

@cupakromer cupakromer commented May 24, 2017

This is a follow up to #507

Bug Fix

  • Fix where scanner thread is quit

    This fixes what I believe was a simply copy-paste error regarding how the scanner thread is quit. This uses the scanner thread handler, instead of the main thread handler, to schedule the quit operation. This way any queued scan jobs (such as a stop) complete before we quit the thread. This is what the comment states, but wasn't what the code implemented.

Behavior Modification

Fix scan stop when bound but no monitored / ranged regions.

This adds a guard to ensure that we only attempt any further scans when the scan cycler is started. This helps ensure the cycler has terminated scanning for the case where all regions have been removed but the service is still bound. In this case only stop will be called (instead of both stop and destroy). When this happens there are two potential scheduled tasks still queued:

  • scanLeDevice(true) from deferScanIfNeeded
  • scheduleScanCycleStop

Which potential task is queued is based on if we were in an active scan period or a between scan period when stop was called. In the case of a between scan period the deferScanIfNeeded will cause a faux scanning cycle to start up again when it ends (scanning won't actually start, but all flags and internal state appear as if it is running). It will run for one more scan cycle before it terminates in finishScanCycle. Similarly, if we are in an active scan period the scheduled job for finishing the scan cycle will continue for the duration of the scan period.

Documentation / Annotations

This helps to try explaining how / when the various internal handlers should be used. This is important to help developers understand various threading flow orders, race conditions, and resource contention problems. As part of the documentation this adds MainThread and WorkerThread annotations to the deferred Runnable tasks' run methods.

This makes the current implicit assumption that the scan cycler is started/stopped on the main thread explicit. This helps quickly inform developers what the callback expectations are in regards to race conditions and state safety.

In order to verify this we need to trace the start/stop/destroy calls back to the service. The service's core methods onCreate and onDestroy are called on the main thread (see links below). This means the service creates it's mMessenger on the main thread, which creates the IncomingHandler on the main thread which binds to the main looper. However, the IncomingHandler class itself leaves the looper unspecified relying on looper for the thread which creates it.

As we did in #430, this modifies IncomingHandler to explicitly use the main looper as a future proofing mechanism. This way we can ensure this implicit expectation doesn't change or at least make it obvious when troubleshooting cases where someone expects it to have changed. Similarly, this adds the main thread annotation to it's handleMessage implementation.

Working from here we can confirm that the only places where the beacon monitoring/ranging is updated is from the main thread through the IncomingHandler. As the IncomingHandler is the main communication channel for the service we transfer the main thread expectation to the associated ranging/monitoring methods. If someone decides to call these methods directly on the service these annotations help the IDEs check/document that such calls are expected from the main thread.

With all of that documented we can now confidently state that the scan cycler's start, stop, and destroy are also meant to be called from the main thread. As start and stop both call scanLeDevice we can start tracing any other threading expectations for it. We already know it's called from the main thread through deferred jobs. This leaves the finishScanCycle as the last call site.

finishScanCycle is only called from scheduleScanCycleStop. As this method name implies it's called through a deferred job on the main thread. It is also called the "first" time in scanLeDevice. Thus we've shown that scanLeDevice, finishScanCycle, and scheduleScanCycleStop are expected to run on the main thread.

See Also

Copy link
Member

@davidgyoung davidgyoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks right. Is there a test you have run that indicates the stop scan did not work as written before? I know it worked in my tests, despite not being written as intended, but it may have just been due to timing issues.

@@ -329,6 +339,7 @@ private PendingIntent getRestartIntent() {
/**
* methods for clients
*/
@MainThread
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this annotation intended to be used this way? The docs say it is supposed to indicate that the method should only be called on the main thread, and I don't see why that would be true in this case. It seems to be intended for methods that manipulate the UI. In this case, I don't think there is anything wrong from calling on a worker thread. Perhaps a comment would be a better way to indicate that it is not necessarily run on a worker thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scanner classes are not thread safe. They are started / run from the beacon service which runs from the main thread. Additionally, in the scanner class mHandler is defined on the main looper. One of the deferred tasks which are queued on this handler call scanLeDevice. As things are right now, for all intents and purposes these methods need to run on the main thread.

While we could add this to the javadoc, the annotation documents this with the benefit of giving an extra hint to intellisense. Now if you write the following you'll get a lint warning/error depending on the IDE configuration:

mScanHandler.post(new Runnable() {
    @WorkerThread
    @Override
    public void run() {
        scanLeDevice(true);
    }
}

protected void scanLeDevice(final Boolean enable) {
try {
mScanCyclerStarted = true;
if (getBluetoothAdapter() == null) {
LogManager.e(TAG, "No Bluetooth adapter. beaconService cannot scan.");
}
if (enable) {
if (enable && mScanningEnabled) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not 100% sure why we would not want to execute this block if mScanningEnabled is false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would we want to allow a scan to begin after stop has already been called?

mScanningEnabled is the main "is this cycled scanner started or stopped" flag. It is only set in start and stop.

@cupakromer
Copy link
Contributor Author

I've attached some detailed method / state traces for the lollipop scanner as follows:

Code Stop on State Log File
Before PR Stop during between scan period file: pre_pr_between_stop.txt
Before PR Stop during active scan period file: pre_pr_scan_stop.txt
PR Stop during between scan period file: pr_between_stop.txt
PR Stop during active scan period file: pr_scan_stop.txt

@davidgyoung
Copy link
Member

What difference should I see in these traces? For both the Before PR and PR files I see that scanning stops after STOP REQUESTED is annotated in the log, and there are no further scan starts.

This helps to try explaining how / when the various internal handlers
should be used. This is important to help developers understand various
threading flow orders, race conditions, and resource contention
problems. As part of the documentation this adds `MainThread` and
`WorkerThread` annotations to the deferred `Runnable` tasks' `run`
methods.

See Also:

- https://developer.android.com/studio/write/annotations.html#thread-annotations
This makes the current implicit assumption that the scan cycler is
started/stopped on the main thread explicit. This helps quickly inform
developers what the callback expectations are in regards to race
conditions and state safety.

In order to verify this we need to trace the `start`/`stop`/`destroy`
calls back to the service. The service's core methods `onCreate` and
`onDestroy` are called on the main thread (see links below). This means
the service creates it's `mMessenger` on the main thread, which creates
the `IncomingHandler` on the main thread which binds to the main looper.
However, the `IncomingHandler` class itself leaves the looper
unspecified relying on looper for the thread which creates it.

As we did in #430, this modifies `IncomingHandler` to explicitly use the
main looper as a future proofing mechanism. This way we can ensure this
implicit expectation doesn't change or at least make it obvious when
troubleshooting cases where someone expects it to have changed.
Similarly, this adds the main thread annotation to it's `handleMessage`
implementation.

Working from here we can confirm that the only places where the beacon
monitoring/ranging is updated is from the main thread through the
`IncomingHandler`. As the `IncomingHandler` is the main communication
channel for the service we transfer the main thread expectation to the
associated ranging/monitoring methods. _If_ someone decides to call
these methods directly on the service these annotations help the IDEs
check/document that such calls are expected from the main thread.

With all of that documented we can now confidently state that the scan
cycler's `start`, `stop`, and `destroy` are also meant to be called from
the main thread. As `start` and `stop` both call `scanLeDevice` we can
start tracing any other threading expectations for it. We already know
it's called from the main thread through deferred jobs. This leaves the
`finishScanCycle` as the last call site.

`finishScanCycle` is only called from `scheduleScanCycleStop`. As this
method name implies it's called through a deferred job on the main
thread. It is also called the "first" time in `scanLeDevice`. Thus we've
shown that `scanLeDevice`, `finishScanCycle`, and
`scheduleScanCycleStop` are expected to run on the main thread.

See Also:

- https://developer.android.com/reference/android/os/Handler.html#Handler%28%29
- https://developer.android.com/training/multiple-threads/communicate-ui.html
- https://developer.android.com/reference/android/app/Service.html
- https://developer.android.com/guide/components/services.html
- #430
This fixes what I believe was a simply copy-paste error regarding how
the scanner thread is quit. This uses the scanner thread handler,
instead of the main thread handler, to schedule the `quit` operation.
This way any queued scan jobs (such as a stop) complete before we quit
the thread. This is what the comment states, but wasn't what the code
implemented.

The `mHandler.removeCallbacksAndMessages` is removed from the background
job as `destroy` is already on the main thread. So we want to clear any
other jobs still in the queue to run later now as we are in the process
of destroying everything.
This adds a guard to ensure that we only attempt any further scans when
the scan cycler is started. This helps ensure the cycler has terminated
scanning for the case where all regions have been removed but the
service is still bound. In this case only `stop` will be called (instead
of both `stop` and `destroy`). When this happens there are two potential
scheduled tasks still queued:

- `scanLeDevice(true)` from `deferScanIfNeeded`
- `scheduleScanCycleStop`

Which potential task is queued is based on if we were in an active scan
period or a between scan period when `stop` was called. In the case of a
between scan period the `deferScanIfNeeded` _will_ cause a faux scanning
cycle to start up again when it ends (scanning won't actually start, but
all flags and internal state appear as if it is running). It will run
for one more scan cycle before it terminates in `finishScanCycle`.
Similarly, if we are in an active scan period the scheduled job for
finishing the scan cycle will continue for the duration of the scan
period.
Through testing it's been confirmed that the bluetooth scan callbacks
are always run on the main thread. This chases that through the scanners
and our callbacks so that this is properly documented / annotated.

During this process the unused `AsyncTask` methods were removed as we
don not use them. Additionally, the `mDistinctPacketsDetectedPerScan`
flag has been declared `volatile` to ensure reads always see the most
recently written value without having to rely on a heavy weight
`synchronized` block. As this is a flag we only perform simple reads /
writes. While we do perform a multi-step operation around
reading/writing this value the work performed is too heavy weight and
long to wrap it in a `synchronized` block. Using `volatile` gives
multipel concurrent background threads a better chance of aborting this
logic if another has changed the value. However, the downside of not
having complete synchronization is that a background thread will try to
check a few more packets and then re-set the flag to `true`.
@cupakromer
Copy link
Contributor Author

I've attached a new set of logs with more detailed annotations. I've also tried to better summarize the behavior in the chart here.

Code Cycle State Log Behavior Log Files
Pre Between Scanning stops but the between cycle continues, then scan cycle starts (but doesn't actually scan) though the flags state otherwise, then once the scan period is over things stops. between-beacons.txt between-no-beacons.txt
Pre Scan Scanning stops but the scan cycle continues until the end of the period. scan-beacons.txt scan-no-beacons.txt
PR Between Scanning stops and a single final deferScanIfNeeded is run. pr-between-beacons.txt pr-between-no-beacons.txt
PR Scan Scanning stops and a single final scheduleScanCycleStop is run. pr-scan-beacons.txt pr-scan-no-beacons.txt

Essentially this PR stops extra unnecessary work being performed by background tasks.

@cupakromer
Copy link
Contributor Author

@davidgyoung I've added more annotated versions of the log files. I think these help demonstrate the behavior a bit better.

@davidgyoung
Copy link
Member

Thanks for those tests, @cupakromer, those are helpful. I'm going to do some more testing with these changes combined with #484 before merging, as that provides a good test case for starting and stopping CycledLeScanner.

@cupakromer
Copy link
Contributor Author

Is this good to merge to master or are there some changes I should make?

@davidgyoung
Copy link
Member

Sorry, I haven't been able to run my tests yet on Android O because I had trouble with the build on my test device. I have the testing environment working now, so I will try to get the testing done tomorrow.

@davidgyoung
Copy link
Member

Did a bunch of tests yesterday on both Android 6.0 and Android O with this PR merged into the scheduled-job-scanning branch.

I started and stopped the CycledLeScanner 12+ on each platform times shifting from background to foreground and scanning stopped appropriately and the thread was killed. I also ran 32 10-second background scan cycles overnight on Android O and again saw scanning end appropriately.

@davidgyoung davidgyoung merged commit 9902ef7 into master Jun 8, 2017
@cupakromer cupakromer deleted the scan-stop-notes branch June 20, 2017 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants