-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Force notifications which touch GUI elements onto main thread #504
Conversation
…lements to post on main thread
One thing that worries me about the recent issues with the keep alive at remote institutions is that it appears to be happening primarily at run starts. The issue is that a modal dialog will block the main thread. If this happens during the run start procedure, it's possible that the detector will not have triggers enabled when this modal pops up. If that's the case we won't be live until the shifter notices and clicks the button which could be ~30 sec to 1 minute. Is the main goal of the keep alive signal to prevent the ellie systems from firing when Orca is not running? If that's the case, can it just automatically be restarted or is there some case in which you don't want it to be restarted? |
… with red log message. Also autorestart at start of ELLIE run sequence
Yes, that is the main goal. I have removed the pop-up and replaced with auto-restarts at the beginning of ELLIE flash sequences. I also added a red warning message so users are aware they can go attempt a restart from the ELLIE gui if they'd like to debug an issue. |
I'll test on the teststand in the morning |
@@ -89,6 +89,7 @@ - (id) init { | |||
connection = [[RedisClient alloc] init]; // Connection must be allocated before port and host name are set | |||
portNumber = TUBII_DEFAULT_PORT; | |||
strHostName = [[NSString alloc]initWithUTF8String:TUBII_DEFAULT_IP]; | |||
[connection setTimeout:2]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the timeout is specified in milliseconds. Would you mind adding a comment to the top of setTimeout() to explain that in this PR too?
How did the testing go with this? Is it ready to go? |
Hey. I tested yesterday. It didn't solve the TUBii, orphan issue, but ran smoothly that aside. I am confident it solved the issue it was originally posted for (i.e. the run modal crashed). |
I should add, while testing on the detector I also stepped through commenting out sections of the code associated with the culprit commit (be17c8e). I commented out all additions in the SNOPModel, SNOPController, TUBiiModel and TUBii related changes to the ELLIEController.m before handing over to afternoon crew for ramping. Even with all of those changes removed I was still seeing TUBii orphans at run restarts. It's not clear at all what's happening here. All I can imagine is that some process has been added which affects the ordering of calls at run restart. Changing the TUBii client timeout to 2s and extending the time between ORCA->TUBii keep alive pulses from (0.5s to 5.0s) also had no effect. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a few minor comments.
@@ -314,7 +314,7 @@ -(IBAction)tellieGeneralFireAction:(id)sender | |||
|
|||
[tellieGeneralFireButton setEnabled:NO]; | |||
[tellieGeneralStopButton setEnabled:YES]; | |||
[[NSNotificationCenter defaultCenter] postNotificationName:ORTELLIERunStartNotification object:nil userInfo:[self guiFireSettings]]; | |||
[[NSNotificationCenter defaultCenter] postNotificationOnMainThreadWithName:ORTELLIERunStartNotification object:nil userInfo:[self guiFireSettings]]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are IBAction methods which means they are called when someone clicks a button right? If so, they are guaranteed to be on the main thread so you don't need to change this.
@@ -328,7 +328,7 @@ -(IBAction)tellieExpertFireAction:(id)sender | |||
} | |||
|
|||
[tellieExpertStopButton setEnabled:YES]; | |||
[[NSNotificationCenter defaultCenter] postNotificationName:ORTELLIERunStartNotification object:nil userInfo:[self guiFireSettings]]; | |||
[[NSNotificationCenter defaultCenter] postNotificationOnMainThreadWithName:ORTELLIERunStartNotification object:nil userInfo:[self guiFireSettings]]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here.
@@ -548,7 +548,7 @@ - (IBAction)amellieFireAction:(id)sender { | |||
|
|||
[amellieFireButton setEnabled:NO]; | |||
[amellieStopButton setEnabled:YES]; | |||
[[NSNotificationCenter defaultCenter] postNotificationName:ORAMELLIERunStartNotification object:nil userInfo:[self guiFireSettings]]; | |||
[[NSNotificationCenter defaultCenter] postNotificationOnMainThreadWithName:ORAMELLIERunStartNotification object:nil userInfo:[self guiFireSettings]]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here.
@@ -89,6 +89,9 @@ - (id) init { | |||
connection = [[RedisClient alloc] init]; // Connection must be allocated before port and host name are set | |||
portNumber = TUBII_DEFAULT_PORT; | |||
strHostName = [[NSString alloc]initWithUTF8String:TUBII_DEFAULT_IP]; | |||
// This is being extended in an attempt to resolve the issues with ORPHANs | |||
// from missing TUBii information at run rollovers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This didn't resolve the orphan issue right? If so, can you delete this comment. Not sure if we want to keep the timeout extended or not. Presumably that was what was causing the run modal issue to be raised, so it could still help for remote institutions, so maybe you can just update the comment to say that the latency from remote shift stations is big enough that we are increasing it in an attempt to prevent timeouts from remote shift stations.
@@ -141,6 +144,9 @@ - (id) initWithCoder:(NSCoder *)aCoder { | |||
|
|||
//Connection must be made before port and host name are set. | |||
connection = [[RedisClient alloc] initWithHostName:strHostName withPort:portNumber]; | |||
// This is being extended in an attempt to resolve the issues with ORPHANs | |||
// from missing TUBii information at run rollovers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here.
Thanks Tony. Those notifications which I've updated to be on the main thread call a void function, not an IBAction. It was separated from the IBAction in the SNOPController as they two need to load their TELLIE specific settings from different places and the logic required in a single function was quite messy. I kept the void function (which picks up the notification) in the SNOPController for a couple of reasons:
|
I'm not sure I understand. What do you mean that they call a void function? Are the three methods tellieGeneralFireAction, tellieExpertFireAction, and amellieFireAction called from methods in separate threads? |
The "ORTellieRunStartNotification" gets picked up, here: https://github.com/snoplus/orca/blob/master/Source/Experiments/SNOP/SNOPController.m#L353 Which calls: https://github.com/snoplus/orca/blob/master/Source/Experiments/SNOP/SNOPController.m#L1789 |
Ok, but since these three methods (tellieGeneralFireAction, tellieExpertFireAction, and amellieFireAction) are all IBAction's and are only called when the user hits a button they are guaranteed to be on the main thread, so when you post the notification it's also guaranteed to be on the main thread, so you don't need to explicitly change the notifications to only post on the main thread. |
Ahh, OK. I see what you're saying. I mis-understood your initial comment. I thought you were referring to the functions they were calling, not where they were called from. I've updated. |
merged. |
This PR is an attempt fix the CRUS crash observed recently at Sussex. Debugging by Martti suggests the issue was line 978 of the TUBii model. This should be resolved by posting the note which calls this method on the main thread.
It may be that #499 was also due to the *ellieFireAction methods in the ELLIEController not posting to the main thread. These notes are picked up in the SNOPController and the resulting method calls make minor GUI changes (activate / deactivate buttons). This PR also addresses those cases.