Duality audio upgrade #636
Had a look into OpenTK regarding the audio system. Points 3 and 4 can be solved and implemented with ease in the current system. Let's see what they have:
There are a number of effects supported by OpenTK. The full list seems to be:
From this list, the ones @ChristianGreiner mentioned are more useful for gamedev purposes. A settings struct along these lines could expose them:

```csharp
struct AudioEffectSettings
{
    ReverbSettings Reverb; // null = no reverb applied
    EchoSettings Echo;     // e.g. { DelayTime = 5.0f, /* other parameters of echo */ }
    // other effect types
}
```

There is yet another consideration: if my understanding is correct, OpenTK can only apply these effects/filters on its internal sound sources. At the moment, each Duality sound source is mapped to an OpenTK sound source. If #482 was implemented somehow, these per-source effects wouldn't be available anymore. |
Here is a proposal for a possible course of implementation:
Details of the audio plugin:
Any input on this? |
Agreed on both points. However, implementing the whole effects range seems like something to be used primarily in scenarios where you'd also require or use a more complex channel / mixer setup - so I'd defer this part for now.
Sounds very neat, and yes, that would likely be a software implementation that really works on the actual audio samples manually - or at least I'm unaware of any matching OpenAL functionality. As such, this would be quite a bit of work, and should be treated separately from the other points. One thought I'd like to add here as well: Right now, we have good control over rendering sources and targets - it's just a few lines to render a camera onto a texture, or scene A into texture A and scene B into texture B. When approaching channels and mixers, it would be nice to have the same functionality for audio as well, the ability to "render audio into a texture" and to control where audio output ends up, and what to do with it. The concept of an output channel could be the key ingredient to this. |
So my point was not to actually implement the effects, but to expose the OpenTK effects, which follow a similar pattern to the functionality that is currently exposed. This is a prerequisite for the future implementation of the audio plugin, which would build on what the Core exposes.
It reads like some sort of routing option for the channels. That means that a channel's output can be another channel's input, or the master channel (practically the application's audio output). We can go far with this... 🙂 |
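To make the routing idea a bit more concrete, here is a minimal sketch (all class and member names are hypothetical, nothing here is existing Duality API): a channel either routes into another channel or, when no target is set, into the master output, with a simple check against routing loops.

```csharp
using System;

// Hypothetical sketch: a channel either routes its output into another
// channel or, when no target is set, into the master output
// (the application's audio output).
public class AudioChannel
{
    public string Name;
    public float Volume = 1.0f;
    private AudioChannel output; // null = route directly to the master output

    public AudioChannel Output
    {
        get { return this.output; }
        set
        {
            // Reject routing that would create a loop (e.g. A -> B -> A).
            for (AudioChannel c = value; c != null; c = c.output)
            {
                if (c == this)
                    throw new ArgumentException("Routing would create a loop.");
            }
            this.output = value;
        }
    }
}
```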
Alright, makes sense. If we really support a bigger range of effects, we need to be careful figuring out the API we expose for that, as I'd like to follow an "open set" approach here, where we might, at some point, allow users to define and implement their own effects in software and use them just the same as the builtin ones. It's a bit different from the settings struct approach, but I'm not sure we should really stuff every possible effect layer into one big item anyway - currently I'm thinking something along the lines of a list of effect instances that are applied in sequence. For OpenAL native effects, they'd just be carrying parameters along to pass over.
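As a rough illustration of that "open set" list-of-effects idea (purely hypothetical names, not an existing API): native effects would only carry parameters over to the backend, while user-defined software effects could process samples themselves.

```csharp
using System;

// Hypothetical sketch of an "open set" effect API: effects are items in a
// list and are applied in sequence. Native OpenAL effects only carry
// parameters over to the backend; software effects process samples directly.
public abstract class AudioEffect
{
    // Software effects override this to process samples in place.
    // Native effects leave it empty and are handled by the backend instead.
    public virtual void Process(float[] buffer, int sampleCount) { }
}

public class ReverbEffect : AudioEffect
{
    // Just parameters, passed along to the native OpenAL effect.
    public float DecayTime = 1.5f;
}

public class CustomDistortionEffect : AudioEffect
{
    public float Gain = 2.0f;
    public override void Process(float[] buffer, int sampleCount)
    {
        // Simple hard clipping as a stand-in for a user-defined effect.
        for (int i = 0; i < sampleCount; i++)
            buffer[i] = Math.Max(-1.0f, Math.Min(1.0f, buffer[i] * this.Gain));
    }
}

// Usage idea (hypothetical API): effects applied in the order they appear.
// soundInstance.Effects.Add(new ReverbEffect());
// soundInstance.Effects.Add(new CustomDistortionEffect { Gain = 3.0f });
```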
Not necessarily - I think it might make sense to actually depend the experimental audio plugin directly on the audio backend, with no detour through nested SoundInstances, especially since that plugin would define its own instances and emitters anyway. Working directly on the audio backend also allows, for example, directly streaming software-generated samples to OpenAL, which would be in line with what I think would be required for custom channel / mixer audio processing.
Yes! Exactly. It would also allow containing a scene's audio output within a scene's own global channel, like it can already be done with rendering a scene's graphics output to a texture - almost a natural extension of the self-contained scenes idea from #504. We'd need to figure out what exactly a channel and a mixer is, what other classes and structures take part in this, and how it's all interconnected. Ideally, we could lay out the basic design first and gradually add functionality. On the downside, this pretty much means implementing our own audio system and using OpenAL only to output the final mix as a platform independent layer. If we see channels as fully contained audio streams, we'd even need to do the whole 3D audio deal manually. Thinking of it that way, we probably should find some middle ground. What would be the smallest possible feature set of channels and mixers? What would that look like? Would it make sense to see channels not as audio data streams, but as hierarchical audio source parameter groupings of some kind? |
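For reference, a minimal sketch of the "hierarchical parameter grouping" interpretation of channels (hypothetical names): OpenAL would keep doing the actual mixing, and a channel would merely cascade parameters like volume and pitch down to the sources assigned to it.

```csharp
// Hypothetical sketch: channels as hierarchical parameter groupings rather
// than audio data streams. OpenAL keeps doing the actual mixing; a channel
// only cascades parameters down to the sources assigned to it.
public class AudioParameterChannel
{
    public float Volume = 1.0f;
    public float Pitch = 1.0f;
    public AudioParameterChannel Parent; // null = top-level channel

    // Effective values are the product of all values along the parent chain.
    public float GetEffectiveVolume()
    {
        float volume = this.Volume;
        for (AudioParameterChannel p = this.Parent; p != null; p = p.Parent)
            volume *= p.Volume;
        return volume;
    }
    public float GetEffectivePitch()
    {
        float pitch = this.Pitch;
        for (AudioParameterChannel p = this.Parent; p != null; p = p.Parent)
            pitch *= p.Pitch;
        return pitch;
    }
}

// Each playing instance would multiply channel.GetEffectiveVolume() into its
// own volume when updating its native OpenAL source parameters.
```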
Points 2 and 5 are usually combined into one solution by effectively making the area of the min distance the area where the source's volume isn't affected by the listener position. Additionally, blending between the 3D positioning and simple 2D playback via distance or a custom curve allows for an area where the listener's position doesn't affect the positioning of the audio source.
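A small sketch of how points 2 and 5 could combine, assuming a simple linear falloff curve (the curve and all names here are placeholders, not how any particular engine does it):

```csharp
using System;

// Illustrative helper: full volume inside the min distance, falloff towards
// the max distance, and a 3D-to-2D blend that fades out positioning close
// to the listener.
public static class AttenuationHelper
{
    public static float GetVolumeFactor(float distance, float minDist, float maxDist)
    {
        if (distance <= minDist) return 1.0f; // inside min distance: volume unaffected
        if (distance >= maxDist) return 0.0f; // beyond max distance: silent
        return 1.0f - (distance - minDist) / (maxDist - minDist);
    }

    // 0 = plain 2D playback, 1 = fully 3D positioned; blended over the min distance area.
    public static float GetSpatialBlend(float distance, float minDist)
    {
        if (minDist <= 0.0f) return 1.0f;
        return Math.Min(1.0f, distance / minDist);
    }
}
```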
This sounds like emulating directivity patterns of a sound source. In the "real" world, a sound source's directivity pattern not only describes its direction-dependent volume but also its direction-dependent frequency spectrum. Furthermore, most sound sources don't have a "static" directivity pattern. E.g., a human voicing a vowel like an "a" has a different directivity pattern than a human making a sharp "s". Usually, audio middlewares restrict themselves to a static directivity pattern, where the sound designer can define a direction-dependent volume attenuation and a direction-dependent filter, with the possibility to alter any parameter of any available effect based on the listener's position around a sound source.
In which way are these features mutually exclusive?
Having an audio mixer where each channel can be routed into another channel or different mixer (without looped dependencies, of course) would be extremely helpful for sound designers. Some more features for the mixer could be:
However, as it has already been pointed out, 3D positioning needs to happen before the mixer. That leaves you with two choices:
EDIT: OpenTK already seems to be using OpenAL Soft on Windows: andykorth/opentk#18 (comment)
I think implementing the 3D positioning and distance-based parameter modulation effects can be done efficiently in C#. Some more feature ideas:
|
We're using a fork of OpenTK and handle this part manually in the platform backend of Duality: The default is the system's native OpenAL, but there's a special case for Windows to check whether a system implementation is available and fall back to OpenAL Soft if the answer is no. It's there, but mainly as a fallback.
Ah, makes sense - if we're doing both effects and direction dependence, we could combine the two to specify not only volume, but also effect parameters in a direction dependent way. Random thought: From an audio designer's perspective, would there be any value in skipping the entire effects parameterization and instead having two audio sources that are crossfaded depending on the cone angle? It would allow applying arbitrarily complex filtering and effects externally (in the audio software of your choice) and give maximum artistic freedom, although still limited to an "inside the main direction" / "outside the main direction" distinction. |
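A quick sketch of what that crossfade could look like, assuming an equal-power blend between an "inside the cone" and an "outside the cone" source (names and curve are placeholders):

```csharp
using System;

// Illustrative sketch of the crossfade idea: two pre-authored sources
// weighted by the angle between the emitter's facing direction and
// the listener.
public static class ConeCrossfade
{
    public static void GetWeights(
        float angleToListenerDeg, // angle between emitter direction and listener
        float innerAngleDeg,      // within this cone only the "front" source is heard
        float outerAngleDeg,      // beyond this cone only the "back" source is heard
        out float frontWeight,
        out float backWeight)
    {
        float t = (angleToListenerDeg - innerAngleDeg) / (outerAngleDeg - innerAngleDeg);
        t = Math.Max(0.0f, Math.Min(1.0f, t));
        // Equal-power crossfade keeps the perceived loudness roughly constant.
        frontWeight = (float)Math.Cos(t * Math.PI * 0.5);
        backWeight = (float)Math.Sin(t * Math.PI * 0.5);
    }
}
```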
I see. I also checked the OpenAL specification to see if it supports feeding the output of a source into another source (which could be used to simulate a mixing engine) but found nothing in that regard. Maybe you can find something like that in there? It has come to my attention that Unity is using an AOT compiler (they call it the "Burst" compiler). Is there something similar available that we could use? It may help reduce the latency when doing the audio mixer + audio effects in C# and thus improve performance considerably.
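For context, the performance-sensitive part would mostly be tight per-sample loops like the following minimal example (illustrative only, not Duality code), which is exactly where JIT/AOT quality matters most:

```csharp
// Illustrative only: the performance-critical part of a C# mixer is a tight
// per-sample loop like this, summing pre-attenuated source buffers into the
// master buffer.
public static class SoftwareMixer
{
    public static void MixInto(float[] master, float[] source, float volume, int sampleCount)
    {
        for (int i = 0; i < sampleCount; i++)
        {
            master[i] += source[i] * volume;
        }
    }
}
```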
Yes, but it makes more sense to not restrict that to the cone angle. Layering is a general and broad concept and should not be restricted to a specific parameter. Here is an example of how it works in audio middlewares: A crucial part of this technique is timing. Starting multiple sources at once needs to guarantee that they start at exactly the same time/buffer/sample position. If not, phase effects will happen, most likely not to the enjoyment of the audio designer. |
The OpenAL effects can be applied per-audio-source. If we use the native audio sources for the bus channels, it's only possible to set the effect on a bus, not on a single audio source. (Of course as a workaround every source can be assigned to exactly one channel with the desired effects.)
As far as I can see, the conversation has gone quite far, from exposing some parameters to implementing an audio workstation in Duality. I think at this point we should also consider switching to a different audio backend, which perhaps provides more of the functionality that we need. Unfortunately there's not a huge load of open-source options; among them, SoLoud seems to be the best. It's a middleware itself, supporting various backends (including OpenAL). Also it has
In my opinion, this project could be a valuable resource, both used directly and as a source code reference. |
I'm unaware of #482 having any involvement with bus mixing structures. The best implementation of #482 won't change a thing in how the current systems interact with each other. It boils down to some averaging between virtual audio listeners affecting the audio source's parameters in Duality.
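To illustrate that averaging idea (hypothetical helper, using System.Numerics.Vector3 only to keep the example self-contained): the backend keeps its single listener, while the core derives an effective emitter-relative position from all virtual listeners, weighted by proximity. Inverse distance weighting is just one possible scheme.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics; // Vector3 used here only to keep the example self-contained

// Hypothetical helper: the backend keeps its single listener, while the core
// computes an effective emitter-relative position from all virtual listeners.
public static class MultiListenerAveraging
{
    public static Vector3 GetEffectiveRelativePosition(Vector3 emitterPos, IList<Vector3> listenerPositions)
    {
        Vector3 weightedSum = Vector3.Zero;
        float totalWeight = 0.0f;
        foreach (Vector3 listenerPos in listenerPositions)
        {
            Vector3 relative = emitterPos - listenerPos;
            float weight = 1.0f / Math.Max(relative.Length(), 0.001f); // closer listeners dominate
            weightedSum += relative * weight;
            totalWeight += weight;
        }
        return (totalWeight > 0.0f) ? weightedSum / totalWeight : Vector3.Zero;
    }
}
```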
Why should the audio emitter's volume have any effect on the way it is mixed in the mixer system? Wouldn't it be simpler to treat this level information at the sound source's 3D positioning/attenuation stage rather than in the mixer stage? Yes, SoLoud's features are impressive. However, their OpenAL backend is probably not what one would expect. From the description:
|
Sure, it's possible with that setup. What I meant originally was that if we redirected all listeners to a single native audio source, then we wouldn't get the native effects on each of the emitters.
Yes, you are right about that. The idea was that the zones could have their own effect section defined as well - but let's drop this for now. |
C++ portability is kind of a different beast than C# portability. Duality is 100% managed C# right now, and introducing C++ would make portability a lot more complicated - we should avoid this.
Okay, I think we need to keep an eye on the scope of these changes blowing up. Let's scale down this goal a bit - an entire audio workstation would probably be overkill, but we can definitely improve both low-level and high-level API and functionality that Duality provides. So far, we have identified some general topics to look into regarding bigger / mid- or long term changes in the audio system, most of which exist independently of each other:
Some of those items are quite big, so we should try to identify common prerequisites and self-contained features, to find a way to gradually progress in smaller increments. |
Absolutely! E.g. if the mixing is done in Duality/C# and OpenAL just gets the master sum of that, the OpenAL audio plugins need to be ported to C#, or we'd have to code the plugins ourselves.
Actually, that's a cool idea. I would avoid live instancing these "area" effects. Instead, the sound designer could set up specific effect parameter settings per zone, and upon overlapping or entering/exiting these zones, a transition between these effect parameter settings begins. |
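A minimal sketch of that per-zone preset idea (hypothetical types): each zone defines an effect parameter preset, and a zone change only starts a timed blend between presets instead of creating new live effect instances.

```csharp
// Hypothetical sketch: per-zone effect parameter presets with a timed
// transition on zone changes instead of live effect instances per area.
public class ZoneEffectPreset
{
    public float LowPassCutoff = 22000.0f; // Hz
    public float ReverbAmount = 0.0f;      // 0..1
}

public static class ZoneEffectBlending
{
    // t runs from 0 to 1 over the transition time after entering/exiting a zone.
    public static ZoneEffectPreset Blend(ZoneEffectPreset from, ZoneEffectPreset to, float t)
    {
        return new ZoneEffectPreset
        {
            LowPassCutoff = from.LowPassCutoff + (to.LowPassCutoff - from.LowPassCutoff) * t,
            ReverbAmount = from.ReverbAmount + (to.ReverbAmount - from.ReverbAmount) * t
        };
    }
}
```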
Like @Schroedingers-Cat said, the base idea was that you can define different zones in which the sound or music has the same volume (like singing birds or wind noise, etc.).
This would be really cool! +1 for that |
The way I see it, the following features can be implemented without rewriting existing OpenAL features:
However, adding more control over the signal flow by implementing an audio mixer means a lot of existing OpenAL functionality needs to be rewritten. This also affects OpenAL audio effects. Also, performance of OpenAL vs the re-implementation in C# will probably be an issue. Some more thoughts:
|
It boils down to improving the high level audio API, so users can implement advanced features themselves without help from the core side. Use cases that came to mind are:
Right now, the high level audio API that Duality provides is somewhat limited to the most common use cases: You play some sound in 3D or 2D, it can move around, change pitch, get a lowpass, it can loop and fade in and out, stream OGG audio data, and other base functionality. However, if you want to access audio data directly, there's no easy way to do it, as it all happens behind the scenes. The closest you can get is by abandoning the high level API and using the low level one directly, but that also means you lose all of the above mentioned functionality unless you rewrite it from scratch. So what I'd like to do is improve and extend the high level API to allow users to interact directly with played audio data more easily.
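Purely as an illustration of what such an extension could look like (none of these members exist in Duality today), here is a callback-style hook that taps into the sample stream while keeping the existing high level functionality intact:

```csharp
// Purely hypothetical illustration - not existing Duality API: a callback
// that receives each block of samples before it is handed to the backend,
// so users can read or modify the data while keeping all existing high
// level functionality (3D, looping, fades, streaming).
public interface IAudioSampleProcessor
{
    // Called per mixed block; 'samples' may be inspected or modified in place.
    void ProcessSamples(float[] samples, int sampleCount, int sampleRate, int channelCount);
}

// Usage idea (hypothetical API):
// soundInstance.SampleProcessors.Add(new MyCustomFilter());
```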
Yep, let's skip that for now. I still like the idea, but it's probably one of the biggest chunks we listed so far and I think it makes sense to take some time to consider our options. Also wouldn't rule out the idea of seeing channels as hierarchical parameter groupings just yet, since that would remove the need for rebuilding what OpenAL already does entirely - which would be a big plus.
Generally speaking, introducing any new dependency is something we should only do when absolutely necessary, especially when it contains any native code. Given the required work, maintenance and portability impact vs. what would be gained, I'm not convinced we should add any of those libraries so far. Adding libraries is something any user of Duality can do, so I'd instead put the focus on making sure that these users can do something useful with those libraries and Duality. That's where an improved audio API becomes a key point. |
It seems that we have a general direction and a list of user needs which seem feasible to implement. The next step would be to agree on some general architecture. The following diagram is a proposal for this, up for debate:
|
Great overview, the diagram really helps too 👍 Let's refine this design a bit.
As far as I understand it, the multiple listener example would need some core interaction as well - since the audio backend operates in a single-listener context, and the averaging algorithm needs to access individual playing instances, it would need to work on (core) Audio Emitters, not the Native Audio Listener. The backend can remain "dumb" / simple and thus, easy to port, while the core does the more complex stuff.
There are two points I don't yet see accounted for here:
One way to address this would be to mimic the rendering upgrade that v3 got and expose native audio buffer handling to the core and plugins using a new backend interface. Also, some naming change requests:
Adding some links on how the audio stuff works in Duality right now, for reference to anyone who might join in:
This is the status quo that we're working on to improve. I'll have to cut it short for time reasons right now, hopefully more next time 🙂 |
Here are some ideas (brainstorming) on how we could improve the audio features of Duality.
1. Support Multiple SoundListeners at Once
See #482
2. Adjust the volume via Gizmo
Set the min / max distance of the audio emitter via the scene editor, like scaling an object:
3. Cone Angles
4. Audio Effects
Add audio effects to a sound emitter (or sound instance?)
5. Audio Areas
Define audio areas (like the shape of a rigidbody?) so the audio emitter gets the same volume everywhere in this area.
So, do you have further proposals and ideas?