corrupted size vs. prev_size error when using -mutsel #23

Open
berkalpay opened this issue Feb 24, 2021 · 0 comments

Running the command mpirun -n 15 pb_mpi -d ../aligned_RNA_seqs_postprocessed.phylip -cat -gtr -mutsel run02 produces the following error:

--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   compute-a-16-46
  Local device: mlx4_0
--------------------------------------------------------------------------

model:
stick-breaking Dirichlet process mixture (cat)

read data from file : ../aligned_RNA_seqs_postprocessed.phylip
number of taxa  : 1139
number of sites : 711
number of states: 4

chain name : run02
run started

[compute-a-16-46.o2.rc.hms.harvard.edu:11425] 14 more processes have sent help message help-mpi-btl-openib.txt / error in device init
[compute-a-16-46.o2.rc.hms.harvard.edu:11425] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
*** Error in `pb_mpi': corrupted size vs. prev_size: 0x0000000004e03120 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7f7c4)[0x7fca5756c7c4]
/lib64/libc.so.6(+0x82fd4)[0x7fca5756ffd4]
/lib64/libc.so.6(__libc_malloc+0x4c)[0x7fca57572adc]
/n/app/gcc/6.2.0/lib64/libstdc++.so.6(_Znwm+0x18)[0x7fca5807ecd8]
pb_mpi[0x4de838]
pb_mpi[0x49c180]
pb_mpi[0x4e66a5]
pb_mpi[0x488cb9]
pb_mpi[0x404d9b]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fca5750f505]
pb_mpi[0x421927]

followed by a memory map.

The error occurs before the first MCMC iteration but after the 0th iteration has been written to the .trace file. The crash is intermittent: it occurs on most runs of the command, but not all. It also occurs with a variety of values of -n.
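
One possible next step for narrowing this down is to resolve the anonymous pb_mpi frames in the backtrace above to function names. The sketch below is only an illustration, not part of the original report: the helper names frames_for and addr2line_command are hypothetical, the binary path is an assumption, and the output of addr2line is only useful if pb_mpi still carries its symbol table (and debug info from -g for file and line numbers). It extracts the frame addresses from the backtrace text and builds an addr2line invocation for them.

```python
import re
from typing import List

# Matches glibc backtrace frames of the form "pb_mpi[0x4de838]".
# Frames from shared libraries, e.g. "/lib64/libc.so.6(+0x7f7c4)[0x...]",
# contain parentheses and are deliberately not matched.
FRAME_RE = re.compile(r"^(?P<module>[^\s\[\]()]+)\[(?P<addr>0x[0-9a-fA-F]+)\]$")


def frames_for(backtrace: str, module: str = "pb_mpi") -> List[str]:
    """Return the addresses of all frames belonging to `module`, in order."""
    addrs = []
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line.strip())
        if m and m.group("module") == module:
            addrs.append(m.group("addr"))
    return addrs


def addr2line_command(backtrace: str, binary: str = "pb_mpi") -> List[str]:
    """Build an addr2line command line for the frames of the main executable.

    `binary` is the path to the executable that produced the backtrace
    (an assumption here; adjust it to the actual install location).
    -f prints function names, -C demangles C++ symbols, -e selects the binary.
    """
    return ["addr2line", "-f", "-C", "-e", binary, *frames_for(backtrace)]


if __name__ == "__main__":
    # Paste the backtrace section of the error message here.
    example = """\
/n/app/gcc/6.2.0/lib64/libstdc++.so.6(_Znwm+0x18)[0x7fca5807ecd8]
pb_mpi[0x4de838]
pb_mpi[0x49c180]
"""
    print(" ".join(addr2line_command(example)))
```

The printed command can then be run by hand in the directory containing the binary. Since the abort is raised inside malloc, the frames only show where the corruption was detected, not necessarily where the heap was overwritten.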
