Error in 'vg autoindex' a GFA file derived from PGGB #3712

Closed
fangbohao opened this issue Aug 4, 2022 · 10 comments · Fixed by #4010

Comments

@fangbohao

1. What were you trying to do?
I am trying to index a GFA graph file (a chromosome) derived from PGGB.

2. What did you want to happen?
Indexing should complete without errors.

3. What actually happened?
vg autoindex crashed; the crash report is pasted below.

4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:

Crash report for vg v1.41.0 "Salmour"
Stack trace (most recent call last):
#24   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x5df43d, in _start
#23   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x1e520cf, in __libc_start_main
#22   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x5b08ce, in main
#21   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0xd3347b, in vg::subcommand::Subcommand::operator()(int, char**) const
#20   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0xc1237c, in main_autoindex(int, char**)
#19   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0xf41d48, in vg::IndexRegistry::make_indexes(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocato
r<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
#18   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0xf2dde8, in vg::IndexRegistry::execute_recipe(std::pair<std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std:
:allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, st
d::allocator<char> > > >, unsigned long> const&, vg::IndexingPlan const*, vg::AliasGraph&)
#17   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0xf2d7fd, in std::_Function_handler<std::vector<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::alloc
ator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::allocator<std::vector<std::__cxx11::basic_string<char, std::char_tra
its<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > (std::vector<vg::IndexFile const*, std::allocator
<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11
::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&), vg::VGIn
dexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::b
asic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11:
:basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#15}>::_M_invoke(std::_Any_data const&, std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*
> > const&, vg::IndexingPlan const*&&, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char
, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
#16   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0xf2d346, in vg::VGIndexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile con
st*> > const&, vg::IndexingPlan const*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_trai
ts<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#11}::operator()(std::vector<vg::IndexFile co
nst*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cx
x11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) const 
[clone .isra.0]
#15   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x1318ce0, in vg::algorithms::gfa_to_path_handle_graph(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<
char> > const&, handlegraph::MutablePathMutableHandleGraph*, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#14   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x1316855, in vg::algorithms::gfa_to_path_handle_graph(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<
char> > const&, handlegraph::MutablePathMutableHandleGraph*, vg::algorithms::GFAIDMapInfo*, long)
#13   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x126cb90, in vg::get_input_file(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::f
unction<void (std::istream&)>)
#12   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x131f4ac, in std::_Function_handler<void (std::istream&), vg::algorithms::gfa_to_path_handle_graph(std::__cxx11::basic_string<
char, std::char_traits<char>, std::allocator<char> > const&, handlegraph::MutablePathMutableHandleGraph*, vg::algorithms::GFAIDMapInfo*, long)::{lambda(std::istream&)#1}>::_M_invoke(std::
_Any_data const&, std::istream&)
#11   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x131dfc9, in vg::algorithms::GFAParser::parse(std::istream&)
#10   Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x131c367, in vg::algorithms::GFAParser::parse(std::istream&)::{lambda()#3}::operator()() const
#9    Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x1318f12, in vg::algorithms::add_path_listeners(vg::algorithms::GFAParser&, handlegraph::MutablePathMutableHandleGraph*)::{lam
bda(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::pair<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_
traits<char>, std::allocator<char> > >, __gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::pair<__g
nu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, __gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_str
ing<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx
11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#2}::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, st
d::pair<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, __gnu_cxx::__normal_iterator<char const*, std::__cxx11
::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::pair<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>
, std::allocator<char> > >, __gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11:
:basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) const [clone
 .isra.0]
#8    Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x17ee688, in handlegraph::PathMetadata::parse_path_name(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocato
r<char> > const&, handlegraph::PathSense&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::
allocator<char> >&, unsigned long&, unsigned long&, std::pair<unsigned long, unsigned long>&)
#7    Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x17eea10, in long long __gnu_cxx::__stoa<long long, long long, char, int>(long long (*)(char const*, char**, int), char const*
, char const*, unsigned long*, int)
#6    Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x5af280, in std::__throw_invalid_argument(char const*)
#5    Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x1d8e148, in __cxa_throw
#4    Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x1d8dfe6, in std::terminate()
#3    Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x1d8df7b, in __cxxabiv1::__terminate(void (*)())
#2    Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x5ad45a, in __gnu_cxx::__verbose_terminate_handler() [clone .cold]
#1    Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x5afdf7, in abort
#0    Object "/n/home00/bfang/.conda/envs/fasrc/bin/vg", at 0x145d3ab, in raise

5. What data and command can the vg dev team use to make the problem happen?

6. What does running vg version say?

vg v1.41.0
@fangbohao
Author

Some large chromosomes work fine with 'vg autoindex', but small chromosomes fail with the error above.

@jeizenga
Contributor

jeizenga commented Aug 4, 2022

Can you provide the command line call that you ran into this error on?

@fangbohao
Author

fangbohao commented Aug 4, 2022 via email

@fangbohao
Author

fangbohao commented Aug 4, 2022 via email

@jeizenga
Contributor

jeizenga commented Aug 4, 2022

@adamnovak This looks to me like it's running into a problem in the named-node stuff you implemented. Could you take a look?

@ASLeonard

I came across this issue when using panSN-spec named input like

ARS_UCD12#hap0#6

but there is a stoll call on the haplotype field, so it has to be numeric (i.e. "ARS_UCD12#0#6"). Not sure if this was causing the same issue, but I got a very similar crash log.
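To illustrate the point about stoll, here is a minimal sketch of how std::stoll treats the two haplotype fields mentioned above. The helper name haplotype_field_converts is hypothetical and is not part of vg; it only wraps the standard library call to show which inputs throw.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not vg code): reports whether std::stoll can convert
// the whole haplotype field of a panSN-style name to an integer.
bool haplotype_field_converts(const std::string& field) {
    try {
        std::size_t consumed = 0;
        std::stoll(field, &consumed);
        // std::stoll stops at the first non-numeric character, so also
        // require that every character of the field was consumed.
        return consumed == field.size();
    } catch (const std::invalid_argument&) {
        // Thrown when no conversion can be performed, e.g. for "hap0".
        return false;
    } catch (const std::out_of_range&) {
        // Thrown when the value does not fit in a long long.
        return false;
    }
}
```

With this sketch, haplotype_field_converts("0") is true and haplotype_field_converts("hap0") is false. If the std::invalid_argument is not caught, the program terminates, which matches frames #7 through #0 of the crash report.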

I couldn't find clear documentation on the pathsense API, but from vg paths -Mv it looks like it expects further groupings than panSN-spec? Is it possible to denote e.g. a primary assembly path vs a haplotype-resolved path or will everything need the sample ploidy to work?

Best,
Alex

@ASLeonard

Found the [path metadata model](https://github.com/vgteam/vg/wiki/Path-Metadata-Model) (I knew I had stumbled on it before), so I will try a bit further with this.

@adamnovak
Member

adamnovak commented Jun 30, 2023

Unfortunately I can't get @fangbohao's file; it looks like it's a Google Drive upload shared with a specific list of people that I'm not on.

But it does seem like a path like ARS_UCD12#hap0#6 might be able to cause a crash in __gnu_cxx::__stoa (which is the string-to-number converter) inside path name parsing.

By my reading of the panSN spec that I had when I wrote the path name parsing, that isn't valid panSN because the haplotype piece hap0 is a string; I thought only numbers were allowed there. Maybe that isn't really true?

Whether that's true or not, we should produce a more useful error when we can't parse the path name.
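One way to get a more useful error is to catch the conversion failure and rethrow it with the offending path name attached. The following is only a sketch under that assumption: parse_haplotype_field is a hypothetical function, not vg's actual parser, and the error wording is made up for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not vg code): converts a haplotype field to a number,
// and on failure reports which path name and which field were at fault.
long long parse_haplotype_field(const std::string& path_name,
                                const std::string& field) {
    try {
        std::size_t consumed = 0;
        long long value = std::stoll(field, &consumed);
        if (consumed != field.size()) {
            // Reject fields such as "1a" that only start with digits.
            throw std::invalid_argument("trailing characters");
        }
        return value;
    } catch (const std::logic_error&) {
        // std::invalid_argument and std::out_of_range both derive from
        // std::logic_error, so this handler covers both failure modes.
        throw std::runtime_error("Could not parse haplotype \"" + field +
                                 "\" in path name \"" + path_name +
                                 "\": expected an integer");
    }
}
```

A caller can then print the message from what() and exit cleanly instead of aborting with a bare stack trace.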

@jeizenga
Contributor

FWIW, the spec does indeed say here that haplotype ID is a number.

@adamnovak
Member

OK, @fangbohao shared the file with me, and I tested my fix, and I now have vg interpreting it like this:

[anovak@swords vg]% vg paths --metadata -x ~/Downloads/VGP\#prim\#SUPER_37.pan.fa.gz.3051141.04f1c29.ecbf8cf.smooth.final.gfa
#NAME	SENSE	SAMPLE	HAPLOTYPE	LOCUS	PHASE_BLOCK	SUBRANGE
MA_2#hap2#h2tg000495l	GENERIC	NO_SAMPLE_NAME	NO_HAPLOTYPE	MA_2#hap2#h2tg000495l	NO_PHASE_BLOCK	NO_SUBRANGE
WA_2#hap1#h1tg000618l	GENERIC	NO_SAMPLE_NAME	NO_HAPLOTYPE	WA_2#hap1#h1tg000618l	NO_PHASE_BLOCK	NO_SUBRANGE
NM_1#hap2#h2tg000401l	GENERIC	NO_SAMPLE_NAME	NO_HAPLOTYPE	NM_1#hap2#h2tg000401l	NO_PHASE_BLOCK	NO_SUBRANGE
AZ_2#hap2#h2tg000020l	GENERIC	NO_SAMPLE_NAME	NO_HAPLOTYPE	AZ_2#hap2#h2tg000020l	NO_PHASE_BLOCK	NO_SUBRANGE
CA_1#hap1#h1tg001701l	GENERIC	NO_SAMPLE_NAME	NO_HAPLOTYPE	CA_1#hap1#h1tg001701l	NO_PHASE_BLOCK	NO_SUBRANGE
CA_1#hap2#h2tg004194l	GENERIC	NO_SAMPLE_NAME	NO_HAPLOTYPE	CA_1#hap2#h2tg004194l	NO_PHASE_BLOCK	NO_SUBRANGE
CA_2#hap2#h2tg002977l	GENERIC	NO_SAMPLE_NAME	NO_HAPLOTYPE	CA_2#hap2#h2tg002977l	NO_PHASE_BLOCK	NO_SUBRANGE
...

It's not parsing it as the file writer intended, I don't think, but it is parsing it to something we can represent. For the file to really work properly (and not result in a possibly unmanageable number of named paths), hap1 and hap2 need to be changed to just 1 and 2. But with #4010 we should at least no longer crash like this.
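The renaming suggested above (hap1 to 1, hap2 to 2) can be sketched as a string transformation on a single path name. The function normalize_pansn_haplotype is hypothetical and not part of vg; it assumes names of the form sample#hapN#contig and leaves anything else unchanged. Applying it to the path lines of a GFA file is outside the scope of this sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not vg code): rewrites "sample#hapN#contig" as
// "sample#N#contig". Names that do not match this shape are returned as-is.
std::string normalize_pansn_haplotype(const std::string& name) {
    const std::size_t first = name.find('#');
    if (first == std::string::npos) return name;
    const std::size_t second = name.find('#', first + 1);
    if (second == std::string::npos) return name;

    // The haplotype field sits between the first two '#' delimiters.
    const std::string hap = name.substr(first + 1, second - first - 1);
    const std::string prefix = "hap";
    if (hap.size() <= prefix.size() ||
        hap.compare(0, prefix.size(), prefix) != 0) {
        return name;
    }
    // Everything after the "hap" prefix must be a digit.
    for (std::size_t i = prefix.size(); i < hap.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(hap[i]))) return name;
    }
    // Keep the sample and its '#', drop the prefix, keep the rest verbatim.
    return name.substr(0, first + 1) + hap.substr(prefix.size()) +
           name.substr(second);
}
```

For example, MA_2#hap2#h2tg000495l becomes MA_2#2#h2tg000495l, while a name that is already numeric, such as ARS_UCD12#0#6, passes through untouched.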
