Fix WAV and add RIFF #229
Thank you for the contribution. Unifying all RIFF-based formats under a riff.ksy spec is long-standing tech debt, but it requires multiple features that KSC currently lacks: kaitai-io/kaitai_struct#81, kaitai-io/kaitai_struct#458, and maybe kaitai-io/kaitai_struct#135 and kaitai-io/kaitai_struct#196. You may also want to add https://www.johnloomis.org/cpe102/asgn/asgn1/riff.html into and
common/riff.ksy
    - id: form_type
      type: str
      size: 4
      pad-right: 0x20
Not sure that we need this pad-right. pad-right is a contract that all the strings have their padding stripped here. But the strings are 4 bytes; 4 bytes is one dword, and a dword is the unit for reading memory on 32-bit architectures, which is faster than reading 3 bytes. The strings are fixed-length, so there is no chance of arbitrarily long padding. So IMHO pad-right here doesn't make any sense and would prevent KSC from optimizing the code if such optimizations were ever introduced.
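To make the point about fixed-length IDs concrete, here is a minimal Python sketch of what stripping 0x20 padding does to a 4-byte chunk ID. It is not KSC-generated code; read_fourcc is a hypothetical helper written only for illustration, and ASCII decoding is an assumption.

```python
def read_fourcc(raw: bytes, strip_padding: bool) -> str:
    """Decode a fixed 4-byte chunk ID, optionally stripping trailing 0x20 bytes.

    Hypothetical helper for illustration only; it mimics the visible effect of
    pad-right: 0x20 on a size-4 string, not the code KSC actually emits.
    """
    if len(raw) != 4:
        raise ValueError("a FourCC is always exactly 4 bytes")
    text = raw.decode("ascii")
    # With padding stripped, "fmt " loses its trailing space.
    return text.rstrip(" ") if strip_padding else text


fmt_stripped = read_fourcc(b"fmt ", strip_padding=True)
fmt_raw = read_fourcc(b"fmt ", strip_padding=False)
riff_stripped = read_fourcc(b"RIFF", strip_padding=True)
```

Either way, exactly 4 bytes are consumed from the stream; the only difference is whether the decoded value keeps its trailing space, so any comparison has to be written against the matching form ("fmt" versus "fmt ").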
There are two types of RIFF structs: with asserts (chunk and parent_chunk_data) and without asserts (chunk_generic and parent_chunk_data_generic). Those without asserts won't be needed when kaitai-io/kaitai_struct#435 (comment) is resolved.
@KOLANICH I think it's now quite readable and less error-prone. Silly mistakes like this one (the marked line also causes kaitai-io/kaitai_struct#604) should no longer happen if all RIFF-based formats use the common classes as containers for specific data. I would continue with the other RIFF formats if nobody objects. The riff.ksy format can be used for exploring the chunk structure of any RIFF-compliant binary file. It is generic; it only implements the INFO list chunk, for which the RIFF specification says the following: "The ‘INFO’ list is a registered global chunk that can be used within any RIFF file." I don't expect that someone would use it in production, but it is quite handy for development.
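As a rough illustration of what "exploring the chunk structure" involves, here is a minimal Python sketch that walks a RIFF buffer. It is not derived from riff.ksy; walk_chunks and the sample bytes are hypothetical. It assumes the standard RIFF layout: a 4-byte ID, a little-endian 32-bit length, the data, and one pad byte after odd-length data, with RIFF and LIST chunks holding a 4-byte type followed by subchunks.

```python
import struct

# Chunk IDs whose data starts with a 4-byte type followed by nested chunks.
CONTAINER_IDS = {b"RIFF", b"LIST"}


def walk_chunks(buf, offset=0, end=None, depth=0):
    """Yield (depth, chunk_id, container_type_or_None, data_length) tuples."""
    end = len(buf) if end is None else end
    while offset + 8 <= end:
        chunk_id, length = struct.unpack_from("<4sI", buf, offset)
        body_start = offset + 8
        body_end = min(body_start + length, end)
        if chunk_id in CONTAINER_IDS and length >= 4:
            container_type = buf[body_start:body_start + 4]
            yield depth, chunk_id, container_type, length
            # Recurse into the subchunks that follow the 4-byte type.
            yield from walk_chunks(buf, body_start + 4, body_end, depth + 1)
        else:
            yield depth, chunk_id, None, length
        # Chunks are word-aligned: skip one pad byte after odd-length data.
        offset = body_start + length + (length & 1)


# Hand-built sample (hypothetical data): RIFF/WAVE with a LIST/INFO and a data chunk.
inam = b"INAM" + struct.pack("<I", 3) + b"ab\x00" + b"\x00"  # 3 data bytes + 1 pad
info_list = b"LIST" + struct.pack("<I", 4 + len(inam)) + b"INFO" + inam
data = b"data" + struct.pack("<I", 2) + b"\x01\x02"
riff_body = b"WAVE" + info_list + data
sample = b"RIFF" + struct.pack("<I", len(riff_body)) + riff_body

layout = list(walk_chunks(sample))
```

Nothing in the walker is WAV-specific, which is the property that makes a generic description usable with any RIFF-based file.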
I wonder if we really need to assert chunk IDs instead of just switching on them everywhere? I mean just read a chunk ID and interpret the blob based on the read chunk ID only, without any checks whether this chunk type is allowed to be here.
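A minimal Python sketch of that "switch only, no assertions" idea follows. The names PARSERS, parse_fmt, and interpret_chunk are hypothetical, and only the first three fields of the WAV "fmt " chunk are decoded; unknown IDs simply fall through as raw bytes instead of raising an error.

```python
import struct


def parse_fmt(body: bytes) -> dict:
    # First three fields of the WAV "fmt " chunk: format tag, channels, sample rate.
    format_tag, channels, sample_rate = struct.unpack_from("<HHI", body, 0)
    return {"format_tag": format_tag, "channels": channels, "sample_rate": sample_rate}


# Dispatch table keyed by chunk ID; there is no check on where a chunk may appear.
PARSERS = {b"fmt ": parse_fmt}


def interpret_chunk(chunk_id: bytes, body: bytes):
    """Interpret a chunk body based on its ID alone; unknown IDs stay raw bytes."""
    parser = PARSERS.get(chunk_id)
    return parser(body) if parser else body


fmt_body = struct.pack("<HHIIHH", 1, 2, 44100, 176400, 4, 16)
parsed_fmt = interpret_chunk(b"fmt ", fmt_body)
parsed_unknown = interpret_chunk(b"JUNK", b"\x00\x01")
```

The trade-off is that a misplaced chunk is parsed rather than rejected, which is more lenient toward real-world files but moves validation to the caller.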
You are right, thank you for the suggestion! I hadn't realized that the assertions are not vital. Is it OK now?
@KOLANICH, can this PR be merged already?
I'm sorry, but I am not the one who merges PRs into this repo. @GreyCat is the one who decides and merges.
Actually, I can merge it myself, because @GreyCat trusts me that I won't do silly things 😏, but I hesitate to do so when I'm not 100% sure that it is OK. I wonder why you don't want to join the Kaitai team; you belong there more than anyone else. I understand that you might not want to feel the pressure to somehow deserve the membership (e.g. at least one contribution per week 😂), but I think everyone understands that you're a busy person and can participate only once in a while.
AFAIK there are no such requirements.
Nothing is ever secure, but I can't imagine the degree of boredom of a hacker that would bother with screwing up an open-source repo like this, but thanks for the info 🙂
access to an org gives access to all the repos. this repo is probably safe,
@GreyCat, can it be finally merged? It's blocking my
Apologies for not reviewing this earlier. I totally agree on the motivation and approach, but there are two things that I've flagged:
What do you think?
At first, I meant the
At the time of writing the specs, I was experimenting with the enum implementation as well. The problem was that it didn't occur to me that I can use But yeah, you're right: string conversion and comparison are inefficient compared to a primitive integer read, and the inconvenience of working with plain numbers can be gracefully solved by converting the integer to a format-specific (wav, avi, ...) chunk ID dictionary enum just using a value instance with the I'll go ahead and try to incorporate the suggestions.
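The integer-plus-enum approach can be sketched in Python as follows. This is an illustration, not the spec's code: WavChunkId and read_chunk_id are hypothetical names, and reading the ID as a little-endian 32-bit integer is an assumption. The constants are simply the ASCII bytes of "fmt " and "data" interpreted that way.

```python
import struct
from enum import IntEnum


class WavChunkId(IntEnum):
    # ASCII bytes of the FourCC read as a little-endian u4.
    FMT = 0x20746D66   # b"fmt "
    DATA = 0x61746164  # b"data"


def read_chunk_id(raw: bytes):
    """Read a chunk ID as one integer, then map it to a format-specific enum.

    IDs unknown to this format are returned as plain integers.
    """
    (value,) = struct.unpack("<I", raw)
    try:
        return WavChunkId(value)
    except ValueError:
        return value


fmt_id = read_chunk_id(b"fmt ")
data_id = read_chunk_id(b"data")
junk_id = read_chunk_id(b"JUNK")
```

The generic layer only ever sees a number, while each format (wav, avi, ...) can supply its own enum, so the readable names live with the format that defines them.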
All the latest changes look great, thanks! However, there are currently 2 conflicting fragments. Can I ask you to take a look and resolve these conflicts, so we can merge this one?
Yeah, I no longer hoped that this PR can be merged in a reasonable amount of time, so I've recently fixed the WAV format manually so that it at least works (it was broken due to kaitai-io/kaitai_struct#604 and merging this PR would also fix it). Thanks for the review, please merge it if all looks good to you 🙂
Thanks for this great work!
Wave sample files: http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/Samples.html
WAV with bext chunk: http://www.noiseofnorway.net/wp-content/uploads/2011/11/170139-Printer-Scanner-close-scanner-lid-Schoeps.wav
RIFF description can be tested with any of its subtypes: WAV, AVI, RMID, DLS, SF2, ...