Expose functions from `AudioStreamPlaybackMicrophone` to access the microphone `AudioDriver::input_buffer` independently of the rest of the Audio system #100508

goatchurchprime · 2024-12-17T13:01:08Z

This solves all the issues raised in the proposal: godotengine/godot-proposals#11347

Changes made:

My new AudioDriver::input_start_count counter is to prevent multiple calls to AudioDriver::input_start() from different AudioStreamPlaybackMicrophone instances that result in multiple conflicting processes to popping data from the same buffer.
Change the return type of AudioDriver::get_input_buffer() to Vector<int32_t>& to avoid copying out entire buffer
to access it.
GDREGISTER_CLASS(AudioStreamPlaybackMicrophone) was missing from register_server_types.cpp
Expose functions start(), stop() and is_playing() from AudioStreamPlaybackMicrophone to GDScript
New function PackedVector2Array AudioStreamPlaybackMicrophone::get_microphone_buffer(int p_frames). This does the same as PackedVector2Array AudioEffectCapture::get_buffer(frames: int) but it fetches the frames directly from the AudioDriver::input_buffer without going through an AudioStream, AudioBus and an AudioEffect, all of which operate on the duty cycle of the Audio system output frequency.

Need help with:

I would like to expose
int AudioStreamPlaybackMicrophone::mix(AudioFrame *p_buffer, float p_rate_scale, int p_frames)
to GDExtension, but this has an AudioFrame* pointer in its parameters list

This ought to be possible, because there is already one like this with int AudioStreamPlaybackResampled::_mix_resampled(dst_buffer: AudioFrame*, frame_count: int),
but it is created by the special virtual function template GDVIRTUAL2R(int, _mix_resampled, GDExtensionPtr<AudioFrame>, int)

Bugsquad edit, closes: A new InputEventMicrophone class to deliver audio chunks of microphone data independently of the Audio system godot-proposals#11347

RedMser · 2024-12-17T13:33:05Z

Since you modified the scripting API, make sure to update the class reference as well and fill it out: https://docs.godotengine.org/en/stable/contributing/documentation/updating_the_class_reference.html#updating-class-reference-when-working-on-the-engine

fire

As we discussed, the design is that instead of the AudioServer using the mix callback somewhere else, I assume a node::_process(delta) will process the mic driver.

I am not convinced by the arguments to bypass the audio server to access the microphone driver directly.

goatchurchprime · 2024-12-17T17:58:33Z

TLDR

There is a long and buggy route for getting microphone audio data out of the engine. It would be shorter and more reliable to extract it directly from the AudioDriver's input_buffer.

What happens now

All microphone data goes into the ring-buffer AudioDriver::input_buffer advancing input_position as it is added.

The AudioStreamPlaybackMicrophone object has its own local pointer input_ofs into this same buffer, and its function _mix_internal() locks the audio device thread before returning a slice from this buffer using read access only.

At the moment the only way to extract microphone data from the engine is to wait for the AudioServer to request chunks of data from an AudioBus at its own update rate. This AudioBus then requests the data from an AudioStreamPlaybackMicrophone filtered through an AudioEffectCapture object that copies any audio that passes through into its own little buffer. Then, on a process loop, you extract the data from this buffer by calling AudioEffectCapture.get_buffer()

There are two problems with this.

Firstly, the AudioServer is highly sensitive to data not arriving in at the rate it requires, so if the microphone buffer isn't filled fast enough the stream is terminated. There is a pull-request to fix the problem on the Android platform where the microphone keeps switching off after two minutes by padding the missing frames with zeros. Not surprisingly, this degrades the quality of the microphone audio.

Secondly, we don't generally want any microphone audio data in the AudioServer because it causes feedback and is too laggy to work as realtime amplifier. So if it is too difficult to debug this buffer properly (which it is), it's not worth it. The evidence is that standard operating procedure requires us to create a special bus for the AudioStreamMicrophone to output to that is set to Mute.

The solution

Expose AudioStreamPlaybackMicrophone::_mix_internal() (suitably wrapped) as an external function, so that any process with a local copy of an AudioStreamPlaybackMicrophone can request slices of data from the input_buffer as it becomes available. This makes no structural change to the engine, but allows a new access route to the microphone data avoiding the unnecessary requirement for it to operate in perfect synchronicity to the Audio streams.

goatchurchprime · 2024-12-21T22:27:42Z

This fix is working well. It ran on my cheap Android phone for 4 hours without any issues at all.

Previously it would last about 3 minutes on this phone before the AudioStreamMicrophone was switched off, often permanently.

Although there is some skepticism, I would like to push this PR into some place where it can be critically discussed and reviewed, because it actually works, and we have don't even have an outline for fixing the Microphone any other way.

The primary function added into AudioStreamPlaybackMicrophone for accessing the audio data is:

PackedVector2Array get_microphone_buffer(int p_frames)

I would prefer to have the additional function:

bool mix_microphone(AudioFrame* buffer, int p_frames);

which would copy the values directly into an array rather than allocating and returning a temporary array by value.

The main use of this interface is the twovoip addon for doing Voice over IP.

There are a numerous virtual functions like this one:

void _process(src_buffer: const void*, dst_buffer: AudioFrame*, frame_count: int) virtual

where the engine calls out to the GDExtension with an array pointer, but no examples I have found where a GDExtension calls into the engine with an array pointer.

My attempts to implement this function using:

bool AudioStreamPlaybackMicrophone::::mix_microphone(GDExtensionConstPtr<AudioFrame> p_buffer, int p_frames);
ClassDB::bind_method(D_METHOD("mix_microphone", "p_buffer", "frames"), &AudioStreamPlaybackMicrophone::mix_microphone);

do not compile, so I have left the call to the bind_method commented out until it's possible to find an answer.

goatchurchprime · 2025-01-03T22:26:21Z

The use of thread locking is inconsistent when accessing the AudioDriver.input_buffer ring-buffunctionfer.

In particular, values are pushed into this ring-buffer by the AudioDriver::input_buffer_write() function from various platforms without a mutex lock, while values are copied from this buffer by the function AudioStreamPlaybackMicrophone::_mix_internal() where it is protected by a lock().

Therefore this locking achieves nothing, and we are getting away with it because either (1) the ring-buffer does not realloc, (2) there is a single audio thread on some platforms, or (3) any issues are mistaken for other bugs in the microphone system.

Confusingly, this ring-buffer has two indexes into it, input_position which is where the data is written, and input_size which is the number of elements in the buffer before the first wrap-around. The conventional method of reading the microphone, using AudioStreamPlaybackMicrophone::_mix_internal(), ignores the input_position value and will happily overtake this index. However, it uses the input_size to the buffer get ahead of the audio system by playback_delay= 50 miliseconds, which is often enough slack to allow things to run for a number of minutes before it mysteriously stops.

My new AudioStreamPlaybackMicrophone::get_microphone_buffer() function however, does use the input_position value which, since it is not locked, could be encountered in an invalid state due to an interruption at line 107 while input_position >= input_buffer.size() is true.

This lack of locking might explain the bizarre if-statement on line 114 protecting this value from being invalid in this way that could only ever happen if two threads were executing this function simultaneously! Such an event -- of two threads reading and posting microphone data into the buffer -- is already a bug, and would be fixed by Change 1 at the top.

To solve this we either need to make this function safe for interruption, by rewriting it like (I don't know if this is safe ? @RandomShaper ):

input_buffer.write[input_position] = sample;
input_position = (input_position % input_buffer.size());

or putting a lock into the AudioDriver::input_buffer_write() function, which would be bad because it would mean the lock function was called up to 96,000 times per second.

A more consistent answer would be to put the lock() into each of the 5 platform functions that copy values into this buffer, such as AudioDriverOpenSL::_record_buffer_callback() for Android.

Since the exact time that we get data from AudioStreamPlaybackMicrophone::get_microphone_buffer() is not critical, it should be protected by a trylock() to avoid holding back the main worker thread.

walksanatora · 2025-01-05T13:11:22Z

person planning on adding VC to a game here.
from what I read (havent touched the PR) this would be a more stable and probally faster way of getting microphone input? which seems like it would be VERY usefull.
I could also imagine this being usefull for Speech To Text and reducing latency on that. (darn now I am considering Waltz Of Wizards verbal casting)

goatchurchprime · 2025-01-05T19:09:53Z

Further issues discovered:

1) The last Pull-Request to the Android stop_input() function leaves it in a state where start_input() cannot work again due to the pointers recordItf and recordBufferQueueItf not being set to null after their objects were deleted. I programming it to destroying the entire recording system, including the recorder object in the stop_input() function for the option of a completely fresh start. (I've experienced other Android apps that require a full power cycle if their microphone stops working, so it might apply to us too.)

2) AudioDriver::input_start() takes an unreasonably long time on Windows, between 600 and 800ms, causing a freeze on startup. Perhaps this should be moved into its own thread.

3) The try_lock() in my AudioStreamPlaybackMicrophone::get_microphone_buffer() returns false every few seconds on Android meaning it's doing some good by not blocking the main thread and instead deferring access to the microphone for an extra frame.

4) There is another starting-the-microphone bug that occurs first time an App is installed onto Android and needs to request permissions. Here is what happens:

AudioStreamPlaybackMicrophone::start() calls AudioDriverOpenSL::input_start() which pops up your Allow to record audio? permissions request dialog box, and returns ERR_UNAUTHORIZED, which means that AudioStreamPlaybackMicrophone.active is set to false.

If you confirm the permission to record, then there is a call-back into Java_org_godotengine_godot_GodotLib_requestPermissionResult() which makes its own call to AudioDriverOpenSL::input_start() now that it is allowed to start reading the microphone. Unfortunately the AudioStreamPlaybackMicrophone.active is still false, so nothing is going to work through it now. There is no pointer from the AudioDriver to the AudioStreamPlaybackMicrophone to report that the microphone is now working.

My preferred solution is to disable the hidden call-back to input_start() and instead attach a function to the SceneTree.on_request_permissions_result signal that tries to enable the microphone a second time from the top if the answer is the affirmative.

fire · 2025-01-06T20:03:08Z

The #85955 (comment) to the Android stop_input() function leaves it in a state where start_input() cannot work again due to the pointers recordItf and recordBufferQueueItf not being set to null after their objects were deleted.

If you made a pr fixing this, you can get the android team to review it.

goatchurchprime · 2025-01-09T11:14:43Z

@m4gr3d @RandomShaper

Need to review:

Consistent use of thread locking on the AudioDriver.input_buffer -- or whether locking is actually necessary if we ensure there is only ever one writer on this ring-buffer that never resizes, and multiple independent readers that manage their own indexes. (The line 114 invalid state can only happen if there are two non-locked threads writing to this buffer!)
Removing the automatic call to AudioDriver.input_start() following the user accepting Android Microphone Permissions that results in the higher level code not knowing when or if the microphone is working. The logic of what to do after user acceptance or rejection can be kicked up to the GDScript level so that the game can know exactly why the microphone is not going to work.
The use of an instantiated copy of AudioStreamPlaybackMicrophone without its corresponding AudioStreamMicrophone to access chunks of audio data directly from the AudioDriver.input_buffer microphone buffer independently of the audio system. (This does not change how anything else functions, because you can have multiple AudioStreamPlaybackMicrophones referencing the single microphone buffer since each one has its own index.)
Proposed to change AudioDriver.input_buffer from an array of floats to an array of AudioFrames.

goatchurchprime · 2025-01-11T18:53:25Z

Minimal project has been added here to the Audio Mic Record Demo https://github.com/goatchurchprime/godot-demo-projects/tree/gtch/micplayback
This gives the option of using the AudioStreamPlaybackMicrophone class if it exists.

Since the get_microphone_buffer() function returns a PackedVector2Array I have to convert it to the WAV bytes using the following code.

recording = AudioStreamWAV.new()
var data = PackedByteArray()
data.resize(4*len(chunks)*chunksize)
for j in range(len(chunks)):
	for i in range(chunksize):
		var k = 4*(j*chunksize + i)
		var frame = chunks[j][i]
		var fl = int(frame.x*32767)
		var fr = int(frame.y*32767)
		data[k] = (fl & 255)
		data[k+1] = ((fl >> 8) & 255)
		data[k+2] = (fr & 255)
		data[k+3] = ((fr >> 8) & 255)
recording.data = data

goatchurchprime requested a review from a team as a code owner December 17, 2024 13:01

adamscott added the topic:audio label Dec 17, 2024

adamscott added this to the 4.x milestone Dec 17, 2024

fire requested changes Dec 17, 2024

View reviewed changes

goatchurchprime requested a review from a team as a code owner December 17, 2024 15:01

goatchurchprime added a commit to goatchurchprime/two-voip-godot-4 that referenced this pull request Dec 18, 2024

using new hacked godot source code from godotengine/godot#100508

e213f56

adamscott self-requested a review December 18, 2024 01:59

AThousandShips added the enhancement label Dec 21, 2024

goatchurchprime mentioned this pull request Dec 23, 2024

Android microphone endless stutter bug goatchurchprime/two-voip-godot-4#15

Open

goatchurchprime mentioned this pull request Jan 3, 2025

Android: Ensure cleanup of all subobjects in the OpenSL audio driver #85955

Merged

goatchurchprime requested review from a team as code owners January 4, 2025 23:39

goatchurchprime mentioned this pull request Jan 17, 2025

[TRACKER] Audio Issues #76797

Open

This was referenced Jan 24, 2025

Microphone recording no longer works under Linux #81950

Closed

Stop AudioStreamPlayback only if it's not playing #101097

Merged

Audio Mic Record Demo fails to record when debug/settings/stdout/verbose_stdout=true on Linux #102013

Closed

goatchurchprime force-pushed the gtch/micplaybackmaster branch from 3427401 to 4b46f73 Compare January 26, 2025 21:47

expose useful functions from AudioStreamPlaybackMicrophone

f0c333b

goatchurchprime force-pushed the gtch/micplaybackmaster branch from 4b46f73 to f0c333b Compare January 26, 2025 22:13

This was referenced Jan 27, 2025

Fix Capture and Record AudioEffect bugs for surround systems #92532

Open

Duplicate jam-launch/godopus#1

Open

Merge branch 'godotengine:master' into gtch/micplaybackmaster

962fbad

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expose functions from `AudioStreamPlaybackMicrophone` to access the microphone `AudioDriver::input_buffer` independently of the rest of the Audio system #100508

Expose functions from `AudioStreamPlaybackMicrophone` to access the microphone `AudioDriver::input_buffer` independently of the rest of the Audio system #100508

goatchurchprime commented Dec 17, 2024 •

edited by AThousandShips

Loading

RedMser commented Dec 17, 2024

fire left a comment

goatchurchprime commented Dec 17, 2024

goatchurchprime commented Dec 21, 2024

goatchurchprime commented Jan 3, 2025

walksanatora commented Jan 5, 2025

goatchurchprime commented Jan 5, 2025 •

edited

Loading

fire commented Jan 6, 2025

goatchurchprime commented Jan 9, 2025

goatchurchprime commented Jan 11, 2025

Expose functions from AudioStreamPlaybackMicrophone to access the microphone AudioDriver::input_buffer independently of the rest of the Audio system #100508

Are you sure you want to change the base?

Expose functions from AudioStreamPlaybackMicrophone to access the microphone AudioDriver::input_buffer independently of the rest of the Audio system #100508

Conversation

goatchurchprime commented Dec 17, 2024 • edited by AThousandShips Loading

RedMser commented Dec 17, 2024

fire left a comment

Choose a reason for hiding this comment

goatchurchprime commented Dec 17, 2024

goatchurchprime commented Dec 21, 2024

goatchurchprime commented Jan 3, 2025

walksanatora commented Jan 5, 2025

goatchurchprime commented Jan 5, 2025 • edited Loading

fire commented Jan 6, 2025

goatchurchprime commented Jan 9, 2025

goatchurchprime commented Jan 11, 2025

Expose functions from `AudioStreamPlaybackMicrophone` to access the microphone `AudioDriver::input_buffer` independently of the rest of the Audio system #100508

Expose functions from `AudioStreamPlaybackMicrophone` to access the microphone `AudioDriver::input_buffer` independently of the rest of the Audio system #100508

goatchurchprime commented Dec 17, 2024 •

edited by AThousandShips

Loading

goatchurchprime commented Jan 5, 2025 •

edited

Loading