Can't execute bbr commands via the clusters #716

Open
LeoLuongVuong opened this issue Aug 30, 2024 · 9 comments

@LeoLuongVuong

Hi
I'm trying to submit my NONMEM models with bbr on my cluster, but I can't. I keep getting the following error. Can someone help me out, please? Many thanks.

submit_model(
mod1,
.mode = "local",
.bbi_args = list(parallel = TRUE, threads = 4, overwrite = TRUE) # not needed if set in bbi.yaml
)
Error in check_status_code(p$get_exit_status(), output, .cmd_args) :

bbi nonmem run local /vsc-hard-mounts/leuven-data/357/vsc35700/Posa_Ped_IPDMA_MLV/nonmem_modelling/1.mod --parallel --threads=4 --overwrite returned status code 1 -- STDOUT and STDERR:
time="2024-08-30T16:03:03+02:00" level=info msg="Successfully loaded default configuration from /vsc-hard-mounts/leuven-data/357/vsc35700/Posa_Ped_IPDMA_MLV/nonmem_modelling/bbi.yaml"
time="2024-08-30T16:03:03+02:00" level=info msg="Beginning Local Path"
time="2024-08-30T16:03:03+02:00" level=info msg="A total of 1 models have completed the initial preparation phase"
time="2024-08-30T16:03:03+02:00" level=info msg="[1] Beginning local work phase"
time="2024-08-30T16:03:13+02:00" level=error msg="[1] Exit code was 115, details were exit status 115"
time="2024-08-30T16:03:13+02:00" level=error msg="[1] output details were: Starting NMTRAN\n \n WARNINGS AND ERRORS (IF ANY) FOR PROBLEM 1\n \n (WARNING 2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.\n \n (WARNING 3)

@kylebaron
Contributor

Hi @LeoLuongVuong -

Is there any additional information in the .lst file? You could cut out the model code and let us see if NONMEM is sending anything else back.

Also - can you tell us if / how the model runs with local execution?

Kyle

@LeoLuongVuong
Author

Hi Kyle
Thanks for your speedy response!
Below is the .lst output:
NM-TRAN MESSAGES

WARNINGS AND ERRORS (IF ANY) FOR PROBLEM 1

(WARNING 2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.

(WARNING 3) THERE MAY BE AN ERROR IN THE ABBREVIATED CODE. THE FOLLOWING
ONE OR MORE RANDOM VARIABLES ARE DEFINED WITH "IF" STATEMENTS THAT DO NOT
PROVIDE DEFINITIONS FOR BOTH THE "THEN" AND "ELSE" CASES. IF ALL
CONDITIONS FAIL, THE VALUES OF THESE VARIABLES WILL BE ZERO.

W Y

(WARNING 79) SIGMA IS USED ON THE RIGHT. WITH A SUBSEQUENT RUN, IF AN
INITIAL ESTIMATE OF A DIAGONAL BLOCK OF SIGMA IS TO BE COMPUTED BY
NONMEM, THAT BLOCK WILL BE SET TO AN IDENTITY MATRIX DURING THAT
COMPUTATION. THIS COULD LEAD TO AN ARITHMETIC EXCEPTION.*

 * THE MAXIMUM NUMBER OF WARNINGS OF ONE OR MORE TYPES WAS REACHED.
   IT IS POSSIBLE THAT SOME WARNING MESSAGES WERE SUPPRESSED.

LIM VALUES MAXLIM ASSESSED BY NMTRAN: 1,2,3,4,5,6,7,8,10,11,13,15,16

Stop Time:
Fri Aug 30 16:03:13 CEST 2024

The model runs perfectly when I execute it with Pirana.

Thanks a lot.

@kylebaron
Contributor

Thanks; can you confirm

  • that the model runs (or at least starts) locally in parallel and locally on a single thread
  • which bbi / bbr versions you're using
  • how big the data are (approx. number of subjects and observations) and how big the model is (approx. how many THETAs)

This looks like a $SIZES issue, but that should be largely taken care of with recent bbi versions which set maxlim when running NONMEM. I just want to confirm that first: it should be ruled out if you're using recent bbi which does this.
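
The version check and the maxlim setting mentioned above could be sketched in R roughly as follows. This is a minimal sketch, not a confirmed fix. It assumes that the installed bbr accepts a `maxlim` entry in `.bbi_args` (the list printed by `bbr::print_bbi_args()` shows whether it does), that the value 2 is a sensible choice, and that `mod1` is the model object from the original post. The `.dry_run = TRUE` call only builds the bbi command so it can be inspected; it does not start NONMEM.

```r
# Illustrative arguments only. `maxlim = 2` is an assumption; confirm that
# `maxlim` is listed by bbr::print_bbi_args() before relying on it.
maxlim_args <- list(maxlim = 2, overwrite = TRUE)

# Guarded so the sketch does nothing where bbr is not installed.
if (requireNamespace("bbr", quietly = TRUE)) {
  print(utils::packageVersion("bbr"))  # version of the bbr R package
  print(bbr::bbi_version())            # version of the bbi command-line tool
  bbr::print_bbi_args()                # arguments accepted in .bbi_args

  # `mod1` is assumed to have been created earlier in the session.
  if (exists("mod1")) {
    dry <- bbr::submit_model(
      mod1,
      .mode = "local",
      .bbi_args = maxlim_args,
      .dry_run = TRUE  # build the command without running the model
    )
    print(dry)  # check whether a maxlim flag appears in the command
  }
}
```

If the printed command already contains a maxlim flag without it being passed explicitly, bbi is setting it on its own and $SIZES is less likely to be the cause.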

@LeoLuongVuong
Author

  1. As I said, I did not run it locally but via my cluster. I think the cluster did not support parallelization, since the run time was the same whether I ran it on a single thread or on multiple cores/threads.
  2. I'm using bbr 1.11.0
  3. the dataset has 322 subjects with 14,321 observations. the model has 9 THETAs.

I hope this helps!

@kylebaron
Contributor

As I said, I did not run it locally

I'm asking you to run it locally.

@LeoLuongVuong
Author

I don't have NONMEM on my PC. It would take some time before I can install it.

@seth127
Collaborator

seth127 commented Aug 30, 2024

Hello @LeoLuongVuong . I'm glad to see this discussion ongoing here. I'm jumping in to hopefully provide some clarification points:

  1. Thanks for giving us the bbr version, but @kylebaron was also asking for the bbi version. You can get this from the R console with bbr::bbi_version().
  2. I'm a little confused by the discussion of "cluster" vs. "local" execution. I see .mode = "local" in the original call. My guess is that you're running this on a remote server of some kind (i.e. not your laptop) but that is still "local" execution mode, in the sense that it is executing directly on that server. By contrast, .mode = "sge" (the default) would submit this to an SGE queue, typically to run on remote servers in a cluster/grid. All this to say: it looks like you currently are running "locally".
  3. Kyle asks you to try the same model not parallelizing. That would be passing .bbi_args = list(parallel = FALSE, overwrite = TRUE) (which will only use a single thread/CPU). Let us know if that runs successfully.
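
The three points above can be sketched as a short R session. This is only a sketch under stated assumptions: `mod1` is the model object from the original post and is assumed to exist already, and the two argument lists simply restate the calls quoted in this thread. Nothing here is confirmed as the fix.

```r
# Single-thread arguments, as suggested in point 3.
single_thread_args <- list(parallel = FALSE, overwrite = TRUE)

# The parallel arguments from the original post, kept for comparison.
parallel_args <- list(parallel = TRUE, threads = 4, overwrite = TRUE)

# Guarded so the sketch does nothing where bbr or `mod1` is unavailable.
if (requireNamespace("bbr", quietly = TRUE) && exists("mod1")) {
  # Point 1: report the bbi version (distinct from the bbr package version).
  print(bbr::bbi_version())

  # Points 2 and 3: run directly on the current machine, on a single thread.
  # .mode = "sge" would submit the job to an SGE queue instead.
  bbr::submit_model(
    mod1,
    .mode = "local",
    .bbi_args = single_thread_args
  )
}
```

If the single-thread run succeeds and the same call with `parallel_args` fails, the problem is more likely in the parallel setup than in the model itself.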

As Kyle noted, it seems like you may have a NONMEM issue (potentially related to SIZES or maxlim). That said, if you get that sorted and you're interested in some more background reading on parallelizing in bbr, these two articles might be useful:

Best of luck, and thanks for jumping in so quickly Kyle!

@LeoLuongVuong
Author

Thanks a lot, @seth127 for your detailed explanation! It appears much clearer to me now. Also, sorry for overlooking your questions @kylebaron. So, to provide more info:

  1. my bbi version is 3.3.0
  2. you are absolutely right! I am indeed running "locally"
  3. It seems like it does run successfully since I see there are way more outputs being generated, so that's great. Do you know why it didn't work with more threads?

Regarding the NONMEM issue, do you have any advice on how to solve that?

Many thanks again for your timely support!

@LeoLuongVuong
Author

Hi Kyle and Seth

I would like to open this issue again since I ran into the same problem, but this time when using NONMEM locally on my PC.

Specifically, I seem to have a size issue, "LIM VALUES MAXLIM ASSESSED BY NMTRAN: 1,2,3,4,5,6,7,8,10,11,13,15,16", although I am using the most recent bbi version: 3.3.0.

My syntax is this:
submit_model(
mod1,
.mode = "local",
.bbi_args = list(parallel = FALSE, overwrite = TRUE),
.wait = FALSE # not needed if set in bbi.yaml
)

Can you give me some suggestions on how to solve this?

Thanks!
