Is there any way to provide custom IORs? #29

Open
razeh opened this issue Apr 4, 2019 · 11 comments

Comments

@razeh
Contributor

razeh commented Apr 4, 2019

For various reasons, I've got a custom format that I want wandio to read. Is there a way to provide a custom ior that wandio can then use?

@alistairking
Contributor

That's an interesting idea.
I can't think of any way to do it with the current interface short of patching wandio to add your custom ior.
Are you using libwandio directly or via libtrace?

@scherepanov

Hi Robert,
Never thought I would be editing your code.
I also have a custom format.
But it was designed to work with the standard things already in wandio.

@razeh
Contributor Author

razeh commented Apr 4, 2019

@alistairking I'm using it via libtrace.

@alistairking
Contributor

Yeah, that's what I figured.
That will make it harder because you somehow have to associate some URI prefix with a custom ior module.
We have been talking about adding an extended _create function to wandio that allows one to pass in a config object, which could have included some callback pointers for dynamically implementing a custom ior, but this won't help when using it from libtrace.

@razeh
Contributor Author

razeh commented Apr 5, 2019

Would it be easier if the code for figuring out which io_source_t to use was moved into the io_sources_t?

If each io_source had a function pointer that returned 1 if it should be used I think it would make wandio_detect_compression_type more amenable to plugins.
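
A minimal sketch of that idea, assuming a made-up descriptor type rather than wandio's real one: every source carries a `should_use` callback, and detection becomes a loop over a table. The names `demo_source_t`, `demo_pick_source`, and the `.myfmt` suffix are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor; not wandio's actual source structure. */
typedef struct demo_source {
    const char *name;
    /* Returns 1 if this source should handle the file, 0 otherwise. */
    int (*should_use)(const char *filename);
} demo_source_t;

/* Returns 1 if s ends with suffix. */
static int has_suffix(const char *s, const char *suffix) {
    size_t n = strlen(s), m = strlen(suffix);
    return n >= m && strcmp(s + n - m, suffix) == 0;
}

static int gzip_should_use(const char *filename) {
    return has_suffix(filename, ".gz");
}

/* A plugin for a custom format would add one entry like this. */
static int custom_should_use(const char *filename) {
    return has_suffix(filename, ".myfmt");
}

static const demo_source_t demo_sources[] = {
    {"gzip", gzip_should_use},
    {"custom", custom_should_use},
};

/* Returns the name of the first source that claims the file, or NULL. */
static const char *demo_pick_source(const char *filename) {
    size_t i;
    assert(filename != NULL);
    for (i = 0; i < sizeof(demo_sources) / sizeof(demo_sources[0]); i++) {
        if (demo_sources[i].should_use(filename)) {
            return demo_sources[i].name;
        }
    }
    return NULL;
}
```

With this shape, adding a format means adding a table entry instead of editing a central detection function.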

@alistairking
Contributor

Yeah, that would be a nice improvement.
Off the top of my head, one complication could be the nesting of multiple formats (e.g., for http://remote/compressed/file.gz, first an http reader is created and then a gzip reader is created).

@razeh
Contributor Author

razeh commented Apr 10, 2019

Just thinking off the top of my head --- if we added an "init" function that took in the filename, and populated the IOR with a priority field, and a success indicator, could we schedule them to handle the nesting of multiple formats?

The IORs that can read raw files -- http and stdio -- would have the highest priority. They'd always be the first to be created. Then any IORs with a lower priority would be added with the higher priority ones as parents.

Would this work? Would we need to add a field that includes the output filename? For example, would we need to know that http://blah.blah/foo.pcap.gz produces foo.pcap.gz, and that gzip produces foo.pcap?
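
To make the scheduling question concrete, here is a rough sketch under stated assumptions: each layer has a priority and an `init` that reports success and writes the name it would produce. Layers are tried from highest to lowest priority, and each success feeds its output name to the next layer. Everything here (`demo_layer_t`, `demo_plan`, the priority values) is hypothetical, and the stdio layer is left out for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_NAME_MAX 256

/* Hypothetical layer descriptor; not part of wandio. */
typedef struct demo_layer {
    const char *name;
    int priority; /* larger values are created first */
    /* Returns 1 and fills `out` with the produced name on success. */
    int (*init)(const char *in, char *out, size_t outlen);
} demo_layer_t;

/* http://host/path/foo.pcap.gz produces foo.pcap.gz */
static int http_init(const char *in, char *out, size_t outlen) {
    if (strncmp(in, "http://", 7) != 0) return 0;
    snprintf(out, outlen, "%s", strrchr(in, '/') + 1);
    return 1;
}

/* foo.pcap.gz produces foo.pcap */
static int gzip_init(const char *in, char *out, size_t outlen) {
    size_t n = strlen(in);
    if (n < 3 || strcmp(in + n - 3, ".gz") != 0) return 0;
    snprintf(out, outlen, "%.*s", (int)(n - 3), in);
    return 1;
}

static int by_priority_desc(const void *a, const void *b) {
    const demo_layer_t *la = a, *lb = b;
    return (lb->priority > la->priority) - (lb->priority < la->priority);
}

/* Fills `chain` with layer names, parent first; returns the chain length. */
static size_t demo_plan(const char *uri, const char **chain, size_t max_chain,
                        char *final_name, size_t final_len) {
    demo_layer_t layers[] = {{"gzip", 1, gzip_init}, {"http", 2, http_init}};
    size_t nlayers = sizeof(layers) / sizeof(layers[0]);
    char cur[DEMO_NAME_MAX], next[DEMO_NAME_MAX];
    size_t i, used = 0;

    assert(uri != NULL && chain != NULL && final_name != NULL);
    qsort(layers, nlayers, sizeof(layers[0]), by_priority_desc);
    snprintf(cur, sizeof(cur), "%s", uri);
    for (i = 0; i < nlayers && used < max_chain; i++) {
        if (layers[i].init(cur, next, sizeof(next))) {
            chain[used++] = layers[i].name;
            snprintf(cur, sizeof(cur), "%s", next);
        }
    }
    snprintf(final_name, final_len, "%s", cur);
    return used;
}
```

For `http://blah.blah/foo.pcap.gz` this plans an http layer followed by a gzip layer and ends with the name `foo.pcap`, which suggests the output-name field is what lets the lower-priority layer recognise its input.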

@razeh
Contributor Author

razeh commented Apr 16, 2019

I've got a commit here: 114c0e5 that does everything I need, but it does create two paths for opening files. I thought it would be a good idea to push this first and see if it looks like the right direction before doing more work.

@salcock
Contributor

salcock commented Apr 17, 2019

My own thoughts on the issue...

@razeh definitely what you've got there would work reasonably well as an interim measure (although any file format detection that relies on the file name / extension makes me feel a bit queasy).

I think in the long run if this is something that people are going to want to do on a regular basis then a better way would be to have all format types implemented as dynamically loadable modules. All such modules live in the same directory and when you start up a wandio program then it scans that directory and dlopen's any appropriate modules there. If you want to add your own specific custom format module, compile it and put it in the directory and it should automatically be supported next time you use wandio. See libpacketdump within libtrace as an example of how this might look.
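
As a rough illustration of the directory-scanning approach (not how wandio or libpacketdump actually do it), the sketch below opens every `.so` file in a directory with `dlopen` and calls a registration function looked up with `dlsym`. The symbol name `demo_format_register` and the function `demo_load_modules` are assumptions made up for this example. On older glibc versions this needs `-ldl` at link time.

```c
#include <assert.h>
#include <dirent.h>
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical entry point each module would export; returns 0 on success. */
typedef int (*demo_register_fn)(void);

/* Loads every *.so in `dir`; returns the number of modules registered,
 * or -1 if the directory cannot be opened. */
static int demo_load_modules(const char *dir) {
    DIR *d;
    struct dirent *ent;
    int loaded = 0;

    assert(dir != NULL);
    d = opendir(dir);
    if (d == NULL) return -1;

    while ((ent = readdir(d)) != NULL) {
        char path[4096];
        size_t n = strlen(ent->d_name);
        void *handle;
        demo_register_fn reg;
        int len;

        if (n < 3 || strcmp(ent->d_name + n - 3, ".so") != 0) continue;
        len = snprintf(path, sizeof(path), "%s/%s", dir, ent->d_name);
        if (len < 0 || (size_t)len >= sizeof(path)) continue;

        handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
        if (handle == NULL) {
            fprintf(stderr, "skipping %s: %s\n", path, dlerror());
            continue;
        }
        /* POSIX-recommended way to store a dlsym result in a function pointer. */
        *(void **)(&reg) = dlsym(handle, "demo_format_register");
        if (reg == NULL || reg() != 0) {
            dlclose(handle);
            continue;
        }
        loaded++; /* the handle stays open for the life of the process */
    }
    closedir(d);
    return loaded;
}
```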

However, this approach would require restructuring all of wandio and is definitely not something I have the time or inclination for right now.

The other thing to bear in mind is that this only really makes sense for custom formats that nobody else will ever use -- if there's any chance that someone else might want to read or write the format in question, then it is probably better to just integrate the support for it into libwandio directly. Even if you don't want to then merge that support back into the main repo, you could assign a compression type of >1000 to your method and minimise any likelihood of conflicting with later "official" additions to the library (barring git conflict detection failures, which admittedly will happen more than they should).

So after all that rambling, what I'm trying to say is that there are a few ways we could go about this and I'd be interested to hear what people think would work best for them and their use cases.

@razeh
Contributor Author

razeh commented Apr 17, 2019

We try to avoid using dynamic libraries because we've found that code runs slower when compiled with -fpic, and we're performance sensitive.

@razeh
Contributor Author

razeh commented Apr 18, 2019

I did think about restructuring the existing IORs to use pre-init and open instead of the current setup, but I stopped myself when I realized I was doing something I didn't need. However, I did get far enough that the pre_init function pointer takes a base IOR to examine content instead of filenames.
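
A minimal sketch of what a content-based `pre_init` could look like, assuming a simplified in-memory reader in place of a real base IOR: the check peeks at the leading bytes and compares them against a magic number rather than looking at the filename. `demo_reader_t`, `demo_peek`, and the `MYF1` magic are hypothetical; the gzip magic bytes `1f 8b` are the real ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a base reader that supports peeking. */
typedef struct demo_reader {
    const unsigned char *data;
    size_t len;
} demo_reader_t;

/* Copies up to n leading bytes into buf without consuming them. */
static size_t demo_peek(const demo_reader_t *r, unsigned char *buf, size_t n) {
    size_t got = n < r->len ? n : r->len;
    if (got > 0) memcpy(buf, r->data, got);
    return got;
}

/* Returns 1 if the stream starts with the gzip magic bytes 1f 8b. */
static int demo_gzip_pre_init(const demo_reader_t *base) {
    unsigned char magic[2];
    assert(base != NULL);
    return demo_peek(base, magic, sizeof(magic)) == sizeof(magic) &&
           magic[0] == 0x1f && magic[1] == 0x8b;
}

/* Returns 1 if the stream starts with the made-up custom magic "MYF1". */
static int demo_custom_pre_init(const demo_reader_t *base) {
    unsigned char magic[4];
    assert(base != NULL);
    return demo_peek(base, magic, sizeof(magic)) == sizeof(magic) &&
           memcmp(magic, "MYF1", 4) == 0;
}
```

Because the checks only peek, several candidate formats can inspect the same base reader before one of them is chosen to open it.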
