Parsing large RIB dumps in parallel could significantly improve processing speed. However, we first need to determine the best way to approach it.
It is not as simple as splitting a RIB dump file into chunks of a given size and processing them separately. MRT records vary in size, and a chunk boundary that cuts a record in two makes the record hard to reconstruct. It is also not straightforward to locate the start of an MRT record inside an arbitrary chunk, because an MRT record, unlike a BGP message with its 16-byte marker, does not begin with a recognizable marker.
We should first test how fast we can decompress and read through an MRT dump file without parsing it. This involves reading only the header of each record and skipping past the number of bytes indicated by the MRT header. The idea is that if we can read a file significantly faster without parsing, we can gather chunks of unparsed raw bytes and send them to a pool of worker threads for parsing. Ideally, gathering raw bytes and handing them to the thread pool should be significantly faster than any individual thread can parse them.
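A minimal sketch of the header-only pass, assuming the 12-byte MRT common header from RFC 6396 (timestamp, type, subtype, length, all big-endian, with the length excluding the header itself). The names `iter_raw_records` and `batch_records` are hypothetical and only illustrate the idea; they are not part of any existing parser. The stream is assumed to be a buffered file-like object, such as one returned by `gzip.open` or `bz2.open`.

```python
import struct
from typing import BinaryIO, Iterator, List

# MRT common header (RFC 6396): timestamp (u32), type (u16),
# subtype (u16), length (u32), all in network byte order.
MRT_HEADER = struct.Struct("!IHHI")


def iter_raw_records(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each MRT record (header + body) as raw, unparsed bytes."""
    while True:
        header = stream.read(MRT_HEADER.size)
        if not header:
            return  # clean end of file
        if len(header) < MRT_HEADER.size:
            raise ValueError("truncated MRT header")
        _timestamp, _type, _subtype, length = MRT_HEADER.unpack(header)
        # The body is read but never interpreted; only its length matters here.
        body = stream.read(length)
        if len(body) < length:
            raise ValueError("truncated MRT record body")
        yield header + body


def batch_records(stream: BinaryIO, batch_size: int) -> Iterator[List[bytes]]:
    """Group whole records into batches that can be handed to worker threads."""
    batch: List[bytes] = []
    for record in iter_raw_records(stream):
        batch.append(record)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller batch
```

Because each batch contains only whole records, a worker can parse it without any knowledge of neighboring batches. Timing `batch_records` alone over a decompressed dump would give the upper bound on throughput that this approach can reach.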
The downside of this approach is that it still depends on how fast the program can read through the bytes of a dump file sequentially. We will need some testing to justify this approach first.