voice_client receiving voice errors / not working ? #2644

Open
3 tasks done
borick opened this issue Nov 10, 2024 · 4 comments
Labels
unconfirmed bug: A bug report that needs triaging

Comments

@borick

borick commented Nov 10, 2024

Summary

I'm trying to use voice_client to receive voice, but I'm getting errors.

Reproduction Steps

Talk to the bot over Discord voice. It errors out for me and I'm not sure why. I'm trying to get the bot to respond to voice automatically, without having to start a command. Is this possible?

Minimal Reproducible Code

import asyncio
import discord
from discord.ext import commands
from dotenv import load_dotenv
from os import environ
from deepgram import DeepgramClient, PrerecordedOptions, FileSource
from PyCharacterAI import get_client
import tempfile
import logging
import numpy as np

logging.basicConfig(level=logging.DEBUG)

intents = discord.Intents.default()
intents.message_content = True
intents.guilds = True
intents.voice_states = True

bot = commands.Bot(command_prefix="!", intents=intents)
connections = {}
load_dotenv()

deepgram = DeepgramClient(environ.get("DEEPGRAM_API_TOKEN"))

character_ai_token = environ.get("CHARACTER_AI_TOKEN")

options = PrerecordedOptions(
    model="nova-2",
    smart_format=True,
    utterances=True,
    punctuate=True,
    diarize=True,
    detect_language=True,
)

class SilenceDetectingSink(discord.sinks.WaveSink):
    def __init__(self, threshold=-40, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold
        self.is_silent = True

    async def process_audio(self, audio_data):
        if self.is_silent:
            if np.mean(audio_data) > self.threshold:
                self.is_silent = False
                self.start_recording()
        else:
            if np.mean(audio_data) < self.threshold:
                self.is_silent = True
                self.stop_recording()
                self.emit("silence_detected")
        await super().process_audio(audio_data)

@bot.event
async def on_ready():
    logging.info(f"Bot is ready and logged in as {bot.user}")

@bot.event
async def on_guild_join(guild):
    logging.info(f'Joined new guild: {guild.name} (id: {guild.id})')

@bot.event
async def on_command_error(ctx, error):
    if isinstance(error, commands.CommandNotFound):
        await ctx.send("Command not found. Please check the available commands.")
    else:
        await ctx.send(f"An error occurred: {str(error)}")
        logging.error(f"An error occurred: {str(error)}")

@bot.command()
async def join(ctx):
    if not ctx.author.voice:
        await ctx.send("⚠️ You aren't in a voice channel!")
        return

    channel = ctx.author.voice.channel
    try:
        vc = await channel.connect()
        connections.update({ctx.guild.id: vc})
        
        vc.start_recording(
            SilenceDetectingSink(threshold=-40),
            once_done,
            ctx.channel,
        )
        
        await ctx.send("🔊 Joined the voice channel. Monitoring conversation...")
        logging.info("Started recording")
        
    except Exception as e:
        await ctx.send(f"Error joining voice channel: {str(e)}")
        logging.error(f"Error joining voice channel: {str(e)}")

async def once_done(sink: discord.sinks, channel: discord.TextChannel, *args):
    logging.info("Recording completed")
    recorded_users = [f"<@{user_id}>" for user_id, audio in sink.audio_data.items()]
    words_list = []

    for user_id, audio in sink.audio_data.items():
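        # Read the recording once; a second read() on the same file object returns b"".
        audio_bytes = audio.file.read()
        with tempfile.NamedTemporaryFile(delete=False) as tmpfile:
            tmpfile.write(audio_bytes)
            tmpfile_path = tmpfile.name

        payload = FileSource(buffer=audio_bytes)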
        response = await deepgram.listen.prerecorded.v("1").transcribe_file(payload, options)
        words = response["results"]["channels"][0]["alternatives"][0]["words"]
        words = [word.to_dict() for word in words]

        for word in words:
            if word["speaker"] != 0:
                user_id = word["speaker"]

            new_word = {
                "word": word["word"],
                "start": word["start"],
                "end": word["end"],
                "confidence": word["confidence"],
                "punctuated_word": word["punctuated_word"],
                "speaker": user_id,
                "speaker_confidence": word["speaker_confidence"],
            }
            words_list.append(new_word)

    words_list.sort(key=lambda x: x["start"])

    transcript = ""
    current_speaker = None

    for word in words_list:
        if "speaker" in word and word["speaker"] != current_speaker:
            transcript += f"\n\nSpeaker <@{word['speaker']}>: "
            current_speaker = word["speaker"]
        transcript += f"{word['punctuated_word']} "

    transcript = transcript.strip()  # was "is", which compared instead of assigning
    await channel.send(f"Finished recording audio for: {', '.join(recorded_users)}. Here is the transcript: \n\n{transcript}")

    logging.info("Transcript created and sent")

@bot.command()
async def leave(ctx):
    if ctx.guild.id in connections:
        vc = connections[ctx.guild.id]
        await vc.disconnect()
        del connections[ctx.guild.id]
        await ctx.send("🚪 Left the voice channel.")
    else:
        await ctx.send("⚠️ I'm not in a voice channel.")

@bot.command()
async def stop_recording(ctx):
    if ctx.guild.id in connections:
        vc = connections[ctx.guild.id]
        vc.stop_recording()
        await ctx.send("🔴 Stopped recording.")
    else:
        await ctx.send("🚫 Not recording here")

print("Bot is starting...")
bot.run(environ.get("DISCORD_BOT_TOKEN"))
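A note on the silence check in SilenceDetectingSink above: it compares np.mean(audio_data) against -40, which looks like a decibel threshold applied to raw sample values. Below is a minimal sketch of a level check in dBFS, assuming the audio is 16-bit little-endian PCM and that -40 was meant as a dB level (both are assumptions, not confirmed in this thread). The names rms_dbfs and is_silent are hypothetical helpers, not part of the library, and wiring them into a sink is left out.

```python
import math
import struct


def rms_dbfs(pcm: bytes) -> float:
    """Return the RMS level of 16-bit little-endian PCM in dBFS.

    Returns -inf for empty or all-zero input.
    """
    # Drop a trailing odd byte so the buffer splits into whole samples.
    usable = pcm[: len(pcm) - (len(pcm) % 2)]
    samples = [s for (s,) in struct.iter_unpack("<h", usable)]
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")
    # 32768 is full scale for signed 16-bit audio.
    return 20 * math.log10(rms / 32768)


def is_silent(pcm: bytes, threshold_db: float = -40.0) -> bool:
    """Treat a chunk as silent when its RMS level is below the threshold."""
    return rms_dbfs(pcm) < threshold_db


# Hypothetical usage: a chunk of zeros counts as silent, a full-scale signal does not.
quiet = bytes(3840)
loud = struct.pack("<4h", 32767, -32767, 32767, -32767)
print(is_silent(quiet), is_silent(loud))
```

Averaging raw signed samples tends toward zero for any audio, loud or quiet, which is why the sketch uses RMS instead of a plain mean.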

Expected Results

It works and responds to voice.

Actual Results

It errors out or does nothing.

Intents

the default

System Information

intents = discord.Intents.default()
intents.message_content = True
intents.guilds = True
intents.voice_states = True

Checklist

  • I have searched the open issues for duplicates.
  • I have shown the entire traceback, if possible.
  • I have removed my token from display, if visible.

Additional Context

n/a

@borick added the "unconfirmed bug" label Nov 10, 2024
@borick changed the title from "voice_client receiving voice errors" to "voice_client receiving voice errors / not working ?" Nov 10, 2024
@Paillat-dev
Contributor

@borick Could you please share the full error traceback?

@thebigsleepjoe

I have a similar error that's been plaguing me as well. I believe it's the same thing @borick is experiencing. It happens every 3-20 seconds while receiving audio. I've been going crazy on this one, turning various snippets of my own code on and off to no avail.

I'll attach my code as well for completeness, as it's only five small files, but I think this is an issue with the library itself: my code never appears in the call stack, and the error keeps occurring no matter what I've tried.

py.tar.gz

Exception

Exception in thread Thread-3 (recv_audio):
Traceback (most recent call last):
  File "/usr/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.12/threading.py", line 1012, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/gigadrive/Scripts/discord-talker/.venv/lib/python3.12/site-packages/discord/voice_client.py", line 863, in recv_audio
    self.unpack_audio(data)
  File "/mnt/gigadrive/Scripts/discord-talker/.venv/lib/python3.12/site-packages/discord/voice_client.py", line 740, in unpack_audio
    data = RawData(data, self)
           ^^^^^^^^^^^^^^^^^^^
  File "/mnt/gigadrive/Scripts/discord-talker/.venv/lib/python3.12/site-packages/discord/sinks/core.py", line 115, in __init__
    self.decrypted_data = getattr(self.client, f"_decrypt_{self.client.mode}")(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/gigadrive/Scripts/discord-talker/.venv/lib/python3.12/site-packages/discord/voice_client.py", line 611, in _decrypt_xsalsa20_poly1305_lite
    return self.strip_header_ext(box.decrypt(bytes(data), bytes(nonce)))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/gigadrive/Scripts/discord-talker/.venv/lib/python3.12/site-packages/discord/voice_client.py", line 615, in strip_header_ext
    if data[0] == 0xBE and data[1] == 0xDE and len(data) > 4:
       ~~~~^^^
IndexError: index out of range
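The last frame of the traceback indexes data[0] without first checking that data is non-empty. A minimal sketch, assuming the decrypted payload of the failing packet is an empty bytes object (an assumption, not something confirmed in this thread), raises the same kind of exception outside the library. The helper first_byte_is_marker is hypothetical and only mimics the indexing pattern.

```python
# Assumption: the decrypted payload of the failing packet is empty.
decrypted = b""


def first_byte_is_marker(data: bytes) -> bool:
    # Same indexing pattern as the failing line in strip_header_ext:
    # data[0] is read before any length check.
    return data[0] == 0xBE


try:
    first_byte_is_marker(decrypted)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```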

@AboveAphid

AboveAphid commented Dec 6, 2024

From my testing, it seems to happen when the audio is short and nothing is being said, so no actual sound is occurring. (Keep in mind I'm using MP3Sink, not WaveSink or your custom SilenceDetectingSink.)

For now I'm just modifying the code snippet and wrapping it in a try-except block for IndexError, but this is a really messy solution, so hopefully we get a fix soon.

@staticmethod
def strip_header_ext(data):
    try:
        if data[0] == 0xBE and data[1] == 0xDE and len(data) > 4:
            _, length = struct.unpack_from(">HH", data)
            offset = 4 + length * 4
            data = data[offset:]
    except IndexError as e:
        print(f"The IndexError occurred but we will ignore it! Just means this part of the data probably is 0 bytes.. Data: {data}")

    return data
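An alternative to catching the exception is to test the length before indexing. This is only a sketch under the same assumptions as the snippet above (the 0xBE 0xDE marker followed by a 16-bit big-endian length counted in 4-byte words); the name strip_header_ext_checked is hypothetical, and this is not the library's own fix.

```python
import struct


def strip_header_ext_checked(data: bytes) -> bytes:
    """Strip the header extension if present; return short or empty data unchanged."""
    # Checking len(data) first means data[0] and data[1] are never read
    # on an empty or too-short payload.
    if len(data) > 4 and data[0] == 0xBE and data[1] == 0xDE:
        _, length = struct.unpack_from(">HH", data)
        offset = 4 + length * 4
        data = data[offset:]
    return data


# Hypothetical usage: an empty payload passes through instead of raising.
print(strip_header_ext_checked(b""))
```

Because `and` short-circuits, moving the length test to the front is enough to avoid the IndexError without hiding other errors behind a broad except.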

@KuraiAI

KuraiAI commented Dec 31, 2024

I want to mention I'm also getting this error. I wasn't getting it in the past, but since downgrading from Windows 10 to Windows 11 with a new PC, I started getting this error when I have multiple people speaking at the same time.
