Panic on BGP4MP_ET File #31
Comments
Probably related: CAIDA/libbgpstream#211. You (@digizeph) seem to have encountered this issue before. |
I see, this has been fixed in the latest version, sorry. |
No worries! Thanks for checking it out! |
After some checking, there seem to be other issues with parsing this file. I am investigating now and will reopen this issue. The problem I see:
|
There seem to be many messages where the NLRI field somehow has strange
|
Yes, I ran some more tests and the count really does not make much sense. However, no two parsers I tested agree on the output. The text outputs of bgpdump and bgpreader are very large (955MB and 1.3GB). bgpscanner's is much smaller at only 170MB, and finally bgpkit only returns 103 elements, as you already wrote. The line counts are also all different.
Maybe this file is just unrecoverably malformed? |
I am still investigating the message, but I do see a lot of warnings on my side, mostly on the NLRI values. The attribute parsing, on the other hand, seems to be OK. Note that the attributes are placed before the NLRI in the MRT files, and both have a dedicated 2-byte field indicating the length of each. So it is unlikely that an error during parsing of the attributes will propagate to the parsing of the NLRI. The just-released v0.6.0 has the following count:
|
I was able to isolate one problematic message. Attached is a self-contained MRT record that other parsers should be able to parse like a regular MRT file (it just contains a single record): |
So, is there any way to handle invalid messages inside the iterator, or are they just silently skipped? Maybe the iterator should be over an |
I am not sure how you would handle an invalid message. Currently, the iterator checks errors from the parser, and if the error is not a critical IO issue, it skips the current record and continues parsing the next one. For records that cannot be parsed correctly, we cannot really create
I have run @SZenglein, I'd like to seek your input on this finding. Thanks in advance!
|
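The skip-and-continue behavior described for the iterator above can be sketched roughly like this. It is only an illustration under stated assumptions, not the crate's actual code: `ParseError`, `SkippingIter`, and the `skipped` counter are hypothetical names, and the inner iterator stands in for whatever yields one parse result per MRT record.

```rust
// Hypothetical error type: one critical variant, one recoverable variant.
#[derive(Debug)]
enum ParseError {
    /// Critical: the underlying reader failed, so iteration cannot continue.
    Io(String),
    /// Recoverable: this one record is malformed, but the next may be fine.
    Malformed(String),
}

/// Hypothetical adapter that yields only successfully parsed records.
struct SkippingIter<I> {
    inner: I,
    /// Number of malformed records dropped so far (an assumption added here
    /// so that callers can at least detect that something was skipped).
    skipped: usize,
}

impl<T, I> Iterator for SkippingIter<I>
where
    I: Iterator<Item = Result<T, ParseError>>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        loop {
            // `?` ends iteration when the inner iterator is exhausted.
            match self.inner.next()? {
                Ok(record) => return Some(record),
                // Malformed record: count it and move on to the next one.
                Err(ParseError::Malformed(_)) => self.skipped += 1,
                // Critical IO failure: stop iterating.
                Err(ParseError::Io(_)) => return None,
            }
        }
    }
}
```

Exposing a counter like `skipped` (or iterating over `Result` values instead) is one way to keep the simple API while making dropped records visible to the caller.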
A few observations:
|
I have confirmed with the RouteViews people that this is an issue with one AMSIX peer sending ADD-PATH data while the MRT collector software does not support ADD-PATH, which causes the corrupted data. As far as I know, the issue is resolved in more recent MRT files. As for the behavior you see with the corrupted files, I believe ours is fine in that we skip corrupted MRT records instead of producing misleading elements. The massive number of elems you see from other parsers could contain a lot of 0.0.0.0/0 announcements, which could potentially be more misleading. However, I do believe that better error messages are needed, and therefore I will add more information regarding parsing issues in the next release. With |
Found a relevant issue with a more in-depth discussion of the problem: |
I'm not sure if I can help you all that much in debugging these files. I also noticed lots of "null" messages and generally weird results. Reconsidering, I agree that the crate API is fine. A |
This refactor enables parsing of NLRI blocks with a potentially wrong add-path type. In some cases, a BGP message with add-path might be wrapped in a non-add-path message, causing trouble when parsing the NLRI block in the BGP message and in the attributes. This patch implements a workaround that can more intelligently "guess" the add-path intention and revert to regular parsing when the guess fails. Here is a brief description of the workaround:
- If add-path is not set (i.e. not an add-path BGP msg) and the first byte of an NLRI entry is 0, it is likely an add-path msg wrapped in a non-add-path msg, so treat it as add-path. This covers both IPv4 and IPv6 cases.
- If an error happens when parsing, revert to the original add-path setting (this only happens for `0.0.0.0/0` or `0::/0` prefixes without add-path, which is pretty rare) and re-parse the whole NLRI block (this needs to go back a few bytes to the beginning).
- The unhandled cases are the ones where the first byte of path_id is not zero. In those cases, we will, unfortunately, parse the msg without path-id, and depending on the msg, we might see parsing errors later (e.g. not enough bytes, or extra bytes).

Test runs show that parsing of one problematic file is back to normal now. See #31 for more details.
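The workaround described above can be sketched roughly as follows. This is a minimal, IPv4-only illustration and not the code from the patch: `Prefix`, `NlriError`, `parse_block`, and `parse_with_guess` are hypothetical names, and the guess is made once from the first byte of the block.

```rust
use std::net::Ipv4Addr;

#[derive(Debug, PartialEq)]
struct Prefix {
    path_id: Option<u32>,
    addr: Ipv4Addr,
    len: u8,
}

#[derive(Debug, PartialEq)]
enum NlriError {
    Truncated,
    BadLength(u8),
}

/// Parse a whole IPv4 NLRI block with a fixed add-path setting.
fn parse_block(data: &[u8], add_path: bool) -> Result<Vec<Prefix>, NlriError> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < data.len() {
        // With add-path, every entry starts with a 4-byte path identifier.
        let path_id = if add_path {
            let b = data.get(pos..pos + 4).ok_or(NlriError::Truncated)?;
            pos += 4;
            Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
        } else {
            None
        };
        // One byte of prefix length in bits, then just enough prefix bytes.
        let len = *data.get(pos).ok_or(NlriError::Truncated)?;
        pos += 1;
        if len > 32 {
            return Err(NlriError::BadLength(len));
        }
        let n = (len as usize + 7) / 8;
        let raw = data.get(pos..pos + n).ok_or(NlriError::Truncated)?;
        pos += n;
        let mut octets = [0u8; 4];
        octets[..n].copy_from_slice(raw);
        out.push(Prefix { path_id, addr: Ipv4Addr::from(octets), len });
    }
    Ok(out)
}

/// Guess add-path when it is not set but the block starts with a zero byte;
/// if the guessed parse fails, re-parse the block with the original setting.
fn parse_with_guess(data: &[u8], add_path: bool) -> Result<Vec<Prefix>, NlriError> {
    let guess = add_path || data.first() == Some(&0);
    match parse_block(data, guess) {
        Ok(prefixes) => Ok(prefixes),
        // Only retry when the guess actually changed the setting.
        Err(_) if guess != add_path => parse_block(data, add_path),
        Err(e) => Err(e),
    }
}
```

In this sketch, `[0, 0, 0, 1, 24, 10, 0, 0]` passed with `add_path = false` is read as `10.0.0.0/24` with path ID 1, while a block holding only `[0]` fails the add-path attempt and falls back to a plain `0.0.0.0/0`.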
Hey @SZenglein, I've added a fix for this issue in #42, and it has been merged into the main branch. Could you give it a try when you have some time? Just check out the main branch and do
The record and elem counts are now:
And there is no |
Looks fine to me at a glance. |
I'll close this issue now. Feel free to reopen it if you find further issues on handling similar MRT files. |
Example file: http://archive.routeviews.org/route-views.amsix/bgpdata/2020.06/UPDATES/updates.20200617.1830.bz2
This file crashes the provided example. bgpdump can read it and reports it as BGP4MP_ET, while bgpscanner also struggles to parse it.