fix: timeout in tform #672

Open
wants to merge 1 commit into
base: master
4 changes: 3 additions & 1 deletion check/features.frm
@@ -306,7 +306,9 @@ assert succeeded?
# ParFORM may terminate without printing the error message,
# depending on the MPI environment.
#pend_if mpi?
assert runtime_error?
# Sometimes, FORM will terminate after 1s without a runtime error.
Collaborator

Stopping without printing a runtime error is a bug. Maybe we can put a "TODO" comment here?

# TODO: this should be considered a bug.
assert succeeded? || runtime_error?
*--#] TimeoutAfter_2 :
*--#[ dedup :
* Test deduplication
23 changes: 23 additions & 0 deletions sources/threads.c
@@ -62,6 +62,12 @@
*/

#include "form3.h"

#ifdef WITH_ALARM
// This is only required if we are blocking SIGALRM in the worker threads.
#include <signal.h>
#endif

#ifdef WITHFLOAT
#include <gmp.h>

@@ -289,6 +295,17 @@ int StartAllThreads(int number)
numberofworkers = number - 1;
threadpointers[identity] = pthread_self();
topofavailables = 0;

#ifdef WITH_ALARM
/* During thread creation, we block SIGALRM on the main thread. The created
threads will inherit this. This is required for #timeout to work properly
in TFORM: only the main thread should receive SIGALRM. */
sigset_t sig_set;
sigemptyset(&sig_set);
sigaddset(&sig_set, SIGALRM);
pthread_sigmask(SIG_BLOCK, &sig_set, NULL);
#endif

Collaborator

Do we need such code also for RunSortBot?

Collaborator Author

Yes. Here the signal is only unblocked further down, after the sort bots have also been started.
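
For readers who want to see the mask inheritance this exchange relies on, here is a minimal standalone sketch. It is not FORM code: `report_mask` and `check_mask_inheritance` are hypothetical names made up for illustration, and the sketch assumes a POSIX system where SIGALRM is not already blocked in the calling thread. It blocks SIGALRM, creates one thread, lets that thread report its own mask, and then unblocks the signal in the creating thread, mirroring the order used in `StartAllThreads`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical worker: stores 1 in *arg if SIGALRM is blocked in this
   thread's signal mask, 0 otherwise. */
static void *report_mask(void *arg)
{
    sigset_t current;
    sigemptyset(&current);
    /* A NULL "set" argument leaves the mask unchanged and only reads it. */
    pthread_sigmask(SIG_BLOCK, NULL, &current);
    *(int *)arg = sigismember(&current, SIGALRM);
    return NULL;
}

/* Hypothetical helper: returns 0 on success and fills in whether SIGALRM
   was blocked in the worker and, after unblocking, in the creating thread.
   Returns -1 if blocking the signal or creating the thread fails. */
int check_mask_inheritance(int *worker_blocked, int *creator_blocked)
{
    sigset_t alarm_set, current;
    pthread_t worker;

    assert(worker_blocked != NULL && creator_blocked != NULL);

    sigemptyset(&alarm_set);
    sigaddset(&alarm_set, SIGALRM);

    /* Block SIGALRM before creating the thread; the new thread starts
       with a copy of the creating thread's signal mask. */
    if (pthread_sigmask(SIG_BLOCK, &alarm_set, NULL) != 0)
        return -1;
    if (pthread_create(&worker, NULL, report_mask, worker_blocked) != 0) {
        pthread_sigmask(SIG_UNBLOCK, &alarm_set, NULL);
        return -1;
    }
    pthread_join(worker, NULL);

    /* Unblock only in the creating thread; threads that already exist
       keep their own masks. */
    pthread_sigmask(SIG_UNBLOCK, &alarm_set, NULL);
    sigemptyset(&current);
    pthread_sigmask(SIG_BLOCK, NULL, &current);
    *creator_blocked = sigismember(&current, SIGALRM);
    return 0;
}
```

Under the stated assumptions, a call such as `check_mask_inheritance(&w, &c)` should leave `w` at 1 and `c` at 0: the worker keeps SIGALRM blocked while the creating thread can receive it again. Any thread started before the unblock, which is the situation described for the sort bots, would be in the same position as the worker here.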

Collaborator

Another thing: WITH_ALARM should be checked. Otherwise, the TFORM build fails on Windows.

for ( j = 1; j < number; j++ ) {
if ( pthread_create(&thethread,NULL,RunThread,(void *)(&dummy)) )
goto failure;
@@ -330,6 +347,12 @@ int StartAllThreads(int number)
IniSortBlocks(number-1);
AS.MasterSort = 0;
AM.storefilelock = dummylock;

#ifdef WITH_ALARM
/* Now we allow the main thread to receive SIGALRM again. */
pthread_sigmask(SIG_UNBLOCK, &sig_set, NULL);
#endif

/*
MesPrint("AB = %x %x %x %d",AB[0],AB[1],AB[2], identityofthreads);
*/