
MalformedError('No key could be detected.') When Using BigQuery Tool in LangGraph Cloud Deployment #3325

johannescastner opened this issue Feb 5, 2025 · 4 comments


Checked other resources

  • This is a bug, not a usage question. For questions, please use GitHub Discussions.
  • I added a clear and detailed title that summarizes the issue.
  • I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • I included a self-contained, minimal example that demonstrates the issue INCLUDING all the relevant imports. The code runs AS IS to reproduce the issue.

Example Code

import os
import json
import asyncio
from typing import Type
import logging
# Core dependencies
from pydantic import BaseModel, Field
# Google Cloud
from google.cloud import bigquery
from google.oauth2 import service_account
# LangChain & LangGraph
from langchain_openai import ChatOpenAI
from langchain_core.tools import StructuredTool
from langgraph.prebuilt import create_react_agent

# Configure logging
logging.basicConfig(level=logging.INFO)

# CONFIGURATION
PROJECT_ID = os.getenv("PROJECT_ID", "datawarehouse-447422")
RAW_DATASET_ID = os.getenv("RAW_DATASET_ID", "linkedin_raw")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# INPUT SCHEMA
class BigQueryListTablesInput(BaseModel):
    dataset_name: str = Field(..., description="Name of the BigQuery dataset to list tables from")

# BIGQUERY CLIENT INITIALIZATION
def get_bigquery_client() -> bigquery.Client:
    """Initialize BigQuery client with proper credentials"""
    if creds_json := os.getenv("GOOGLE_CLOUD_CREDENTIALS_JSON"):
        logging.info("Using service account credentials from environment variable.")
        credentials = service_account.Credentials.from_service_account_info(json.loads(creds_json))
        return bigquery.Client(credentials=credentials, project=credentials.project_id)
    logging.info("Using default project ID for BigQuery client.")
    return bigquery.Client(project=PROJECT_ID)

# TOOL IMPLEMENTATION
async def list_bigquery_tables(dataset_name: str) -> str:
    """List tables in a BigQuery dataset"""
    logging.info(f"Received dataset_name: {dataset_name}")
    if not dataset_name:
        raise ValueError("Missing required input: dataset_name")
    
    try:
        logging.info("Starting BigQuery client initialization...")
        client = get_bigquery_client()
        logging.info(f"BigQuery client initialized successfully with project ID: {client.project}")
        
        logging.info(f"Creating dataset reference for dataset: {dataset_name}")
        dataset_ref = client.dataset(dataset_name)
        logging.info(f"Dataset reference created: {dataset_ref.path}")
        
        logging.info("Listing tables in the dataset...")
        tables = client.list_tables(dataset_ref)
        table_ids = ", ".join(table.table_id for table in tables)
        logging.info(f"Table IDs: {table_ids}")
        
        return table_ids or "No tables found"
    except Exception as e:
        logging.error(f"Error listing tables: {e}")
        raise

# TOOL REGISTRATION
tools = [
    StructuredTool.from_function(
        coroutine=list_bigquery_tables,  # async-only tool; passing an async def as func= would break sync invocation
        name="list_bigquery_tables",
        description="Lists tables in a BigQuery dataset. Input: JSON object with 'dataset_name'",
        args_schema=BigQueryListTablesInput,
    ),
]

# AGENT CREATION
def create_agent():
    llm = ChatOpenAI(
        model_name="gpt-3.5-turbo",
        temperature=0,
        max_tokens=1200,
        openai_api_key=OPENAI_API_KEY
    )
    return create_react_agent(llm, tools)

# Initialize the agent graph
graph = create_agent()

# Simulate agent flow locally (for testing purposes)
async def simulate_agent_flow():
    # Simulate input generation
    dataset_name = "linkedin_raw"
    
    # Test the tool
    print(await list_bigquery_tables(dataset_name))

# Run simulation locally
if __name__ == "__main__":
    asyncio.run(simulate_agent_flow())

Error Message and Stack Trace (if applicable)

Error listing tables: No key could be detected.

Description

I am encountering a persistent MalformedError('No key could be detected.') error when deploying an agent with a BigQuery tool (list_bigquery_tables) to LangGraph Cloud. The same code works flawlessly in a local environment, which suggests the issue lies within the LangGraph Cloud deployment or its interaction with external APIs like BigQuery.

Steps to Reproduce

  1. Deploy the following minimal code to LangGraph Cloud.
  2. Set the required secrets (GOOGLE_CLOUD_CREDENTIALS_JSON and OPENAI_API_KEY) in the LangGraph Cloud environment.
  3. Trigger the list_bigquery_tables tool by asking the agent to list tables in the linkedin_raw dataset.

Expected Behavior
The tool should successfully list all tables in the specified dataset and return their names.

Actual Behavior
The tool fails with the following error:

Error listing tables: No key could be detected.

Troubleshooting Steps Taken

  • Validated secrets: confirmed that GOOGLE_CLOUD_CREDENTIALS_JSON and OPENAI_API_KEY are correctly set in LangGraph Cloud (a diagnostic sketch follows this list).
  • Tested locally: verified that the code works locally with the same service account credentials.
  • Added logging: enhanced logging to capture the entire execution flow, including project ID, dataset reference, and table listing.
  • Checked permissions: ensured the service account has the BigQuery Admin role.
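For reference, MalformedError('No key could be detected.') appears to originate in google.auth's key-parsing layer when the private_key in the service account info cannot be parsed as PEM. A minimal diagnostic sketch (stdlib only; assuming the same env var name as in the example above) that can be run inside the deployment to check whether the secret survives transport intact:

import json
import os

raw = os.getenv("GOOGLE_CLOUD_CREDENTIALS_JSON", "")
print("env var length:", len(raw))  # 0 would mean the secret is missing entirely
info = json.loads(raw)              # a JSONDecodeError here is itself informative
key = info.get("private_key", "")
print("PEM header present:", key.startswith("-----BEGIN PRIVATE KEY-----"))
print("real newlines:", key.count("\n"))      # should be > 0 for a valid PEM block
print("escaped newlines:", key.count("\\n"))  # literal backslash-n suggests mangled escaping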

Hypotheses

  • LangGraph Cloud restrictions:
    LangGraph Cloud might block or limit outbound HTTP requests to external APIs like BigQuery.
    There could be constraints on the size or format of responses returned by tools.
  • Agent-tool integration:
    The agent might mishandle the tool's response, leading to malformed output.
    There could be a mismatch between the expected and actual response formats.

Request for Assistance

Could the maintainers of LangGraph Cloud provide clarification on the following points?

  • Are there any restrictions on outbound HTTP requests to external APIs like BigQuery?
  • Are there specific requirements for tool response formats or schemas?
  • Could this issue be related to the runtime environment or permissions in LangGraph Cloud?

This example is fully self-contained, minimal, and reproducible. It includes all relevant imports, configurations, and logging to help diagnose the issue. Please let me know if further clarification is needed!

System Info

Python Version: 3.9+
Required Libraries: google-cloud-bigquery, langchain-openai, langchain-core, langgraph
Deployment Environment: LangGraph Cloud
Google Service Account Role: BigQuery Admin
OpenAI Model: gpt-3.5-turbo

eyurtsev (Contributor) commented Feb 5, 2025

@johannescastner the error message suggests that the credentials cannot be found. It could be a bad error message from the Google client, but could we dig into that first to confirm that this is really not the case?

Could you:

  1. Expose the rest of the stack trace so it's clear where the error is originating from.
  2. Perhaps remove this branch to make sure it's not accidentally triggered: return bigquery.Client(project=PROJECT_ID). (A sketch of that change follows.)
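A minimal sketch of suggestion 2, assuming the env-var secret is the only intended credential source (fail fast instead of silently falling back to default credentials):

import json
import os

from google.cloud import bigquery
from google.oauth2 import service_account

def get_bigquery_client() -> bigquery.Client:
    """Build the BigQuery client strictly from the env-var secret; no fallback path."""
    creds_json = os.getenv("GOOGLE_CLOUD_CREDENTIALS_JSON")
    if not creds_json:
        raise RuntimeError("GOOGLE_CLOUD_CREDENTIALS_JSON is not set")
    credentials = service_account.Credentials.from_service_account_info(json.loads(creds_json))
    return bigquery.Client(credentials=credentials, project=credentials.project_id)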

johannescastner (Author) commented

Hi @eyurtsev (Eugene Yurtsev),

Thank you for your guidance so far. I’ve done additional debugging based on your suggestions, and here’s an update on what I’ve tried and observed:

Exposing the Full Stack Trace:
I added traceback.format_exc() to log the full stack trace when an exception occurs:

import traceback

try:
    ...  # body of list_bigquery_tables, unchanged from the example above
except Exception as e:
    logging.error(f"Error listing tables: {e}")
    logging.error(f"Traceback: {traceback.format_exc()}")
    raise

However, despite this line being executed, the full stack trace is not appearing in the logs. The only error message logged is:

Error listing tables: No key could be detected.

This makes it difficult to pinpoint the exact location of the error.

Testing Base64 Encoding:
To rule out issues with special characters or escaping, I encoded the GOOGLE_CLOUD_CREDENTIALS_JSON secret in Base64 and decoded it in the code:

import base64
import json
import os

from google.oauth2 import service_account

base64_encoded = os.getenv("BASE64_GOOGLE_CLOUD_CREDENTIALS_JSON")
creds_json = base64.b64decode(base64_encoded).decode("utf-8")
credentials = service_account.Credentials.from_service_account_info(json.loads(creds_json))

Unfortunately, this approach also failed with the same error (No key could be detected.). Additionally, the run could not be cancelled, and it kept consuming OpenAI credits until I had to delete the deployment to stop it.

Character Length of the JSON String:
The unencoded JSON string (including the private_key) is 2,318 characters long. I suspect there might be a maximum character length for environment variables in the LangGraph Cloud GUI, which could be causing truncation or improper handling of the secret.
Could you confirm whether there is a character limit for environment variables in LangGraph Cloud? If so, this could explain why the same JSON string works locally but fails in the deployment.
Local Testing Works Fine:
The exact same JSON string works perfectly in a local environment with the following test function:

def test_list_tables(dataset_name):
    client = get_bigquery_client()
    dataset_ref = client.dataset(dataset_name)
    tables = client.list_tables(dataset_ref)
    print(", ".join(table.table_id for table in tables) or "No tables found")

This strongly suggests that the issue lies with the deployment environment rather than the Google Cloud credentials themselves.

Hypothesis:
Since the JSON string works locally but fails in the deployment, it seems likely that the issue is related to how the secret is handled in the LangGraph Cloud GUI. Possible culprits include:
  • A character length limit for environment variables.
  • Improper handling of special characters (e.g., \n in the private_key).
  • Deployment-specific restrictions on secrets or external API interactions.

Could you please provide clarification on the following points?

  • Is there a character length maximum for environment variables in LangGraph Cloud?
  • Are there any known issues with handling long secrets or special characters in the GUI?
  • Could this be related to deployment-specific restrictions on secrets or external API interactions?

Thank you for your help, and please let me know if you need any additional information.

andrewnguonly (Contributor) commented

> Is there a character length maximum for environment variables in LangGraph Cloud?

There's no specific character length maximum for an individual environment variable. However, the entire secrets data and metadata cannot exceed 1 MB.

> Are there any known issues with handling long secrets or special characters in the GUI?

No, no known issues.

> Could this be related to deployment-specific restrictions on secrets or external API interactions?

No, not that I know of, but let's hold off on this possibility for now.

> Since the JSON string works locally but fails in the deployment, it seems likely that the issue is related to how the secret is handled in the LangGraph Cloud GUI. Possible culprits include:
> A character length limit for environment variables.

As a test, put in some bogus value for GOOGLE_CLOUD_CREDENTIALS_JSON, but keep the same formatting the JSON should have. Print out the value after reading it from the environment variable and see if it matches what was input in the UI. This will rule in/out formatting issues with the value. You can also print out the string length (e.g., print(len(creds_json))), just to see. I'll also test this out. (A sketch of this round-trip check follows.)
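A hedged sketch of that round-trip check; the bogus JSON here is illustrative only and should match whatever was entered in the LangGraph Cloud UI:

import os

# The same bogus JSON that was entered in the UI (illustrative value only).
expected = '{"type": "service_account", "private_key": "-----BEGIN PRIVATE KEY-----\\nAAAA\\n-----END PRIVATE KEY-----\\n"}'

actual = os.getenv("GOOGLE_CLOUD_CREDENTIALS_JSON", "")
print("length matches:", len(actual) == len(expected))
print("value matches UI input:", actual == expected)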

> Improper handling of special characters (e.g., \n in the private_key).

Test this out with a smaller string. Print out results. See if they match. I'll also test this out.

@andrewnguonly
Copy link
Contributor

> Improper handling of special characters (e.g., \n in the private_key).

> Test this out with a smaller string. Print out results. See if they match. I'll also test this out.

I just tested this and the \n character is preserved correctly. I also tested with an escaped newline character (\\n). (A sketch of that kind of round-trip check follows.)
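A small sketch of that kind of check, assuming a test secret named TEST_SECRET set in the UI to a value containing a newline (the name and value are illustrative):

import os

value = os.getenv("TEST_SECRET", "")
print(repr(value))  # expect a real newline, e.g. 'line1\nline2'
print("real newlines:", value.count("\n"))
print("escaped (literal) newlines:", value.count("\\n"))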
