
[Bug]: TokenCountingHandler remains 0 #12214

Open
BastiaanRudolf opened this issue Mar 24, 2024 · 8 comments
Labels: bug (Something isn't working), triage (Issue needs to be triaged/prioritized)

Comments

@BastiaanRudolf

Bug Description

Hi! Love this project, and it's a blessing to work with.

I ran into a small problem. When using the new TokenCountingHandler through the new global Settings, the token counts remain 0. No warning or error is raised.

What am I missing? Any help is much appreciated 🙏

Version

0.10.23

Steps to Reproduce

import os
import time

import openai
import tiktoken
from llama_index.llms.openai import OpenAI

from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.legacy.indices.empty import EmptyIndex
from .utils import create_chat_message, update_chat_message

openai.api_key = os.environ["OPENAI_API_KEY"]
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode,
    verbose=True,  # set to true to see usage printed to the console
)
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.7)
Settings.callback_manager = CallbackManager([token_counter])

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

# Index
index = EmptyIndex()
engine = index.as_chat_engine(
    chat_mode="context",
    memory=memory,
    verbose=True
)

response = engine.stream_chat(
    message="A random message",
)

# Generate response & store in message
resp_string = ""
for resp in response.response_gen:
    resp_string += resp

# Get tokens & reset counter
update_chat_message(
    id=chat_message_responder_id,
    chat=chat,
    duration=time.time() - start,
    total_tokens=token_counter.total_llm_token_count,
    embedding_tokens=token_counter.total_embedding_token_count,
    prompt_tokens=token_counter.prompt_llm_token_count,
    completion_tokens=token_counter.completion_llm_token_count
)
token_counter.reset_counts()

Relevant Logs/Tracebacks

No response

@BastiaanRudolf added the bug and triage labels on Mar 24, 2024

dosubot bot commented Mar 24, 2024

Based on the information you've provided and the context from the LlamaIndex repository, it seems like there might be an issue with how the Settings.callback_manager is being set or how the TokenCountingHandler is being used in version 0.10.23 of the LlamaIndex project.

The TokenCountingHandler is designed to track and count tokens for two kinds of events: LLM (large language model) calls and embedding calls. It maintains a separate list for each (llm_token_counts and embedding_token_counts), appending a record with that event's token count each time an event ends. It also keeps running totals for both event types (total_llm_token_count and total_embedding_token_count), which are updated as events end and reflect the cumulative counts across all processed events of each type. You can find more details about this in the test_token_counter.py file.
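
For reference, once calls are actually flowing through the handler, the per-event records and running totals can be inspected directly. A minimal sketch (the attribute names match the handler's public fields, but verify them against your installed version):

# Sketch: inspecting a TokenCountingHandler after a call has gone through
# its CallbackManager. llm_token_counts holds one record per LLM call;
# the totals accumulate across all events seen so far.
import tiktoken
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode,
)
callback_manager = CallbackManager([token_counter])

# ... run an LLM or embedding call through a component that uses callback_manager ...

for event in token_counter.llm_token_counts:
    print(event.prompt_token_count, event.completion_token_count, event.total_token_count)

print("LLM tokens:", token_counter.total_llm_token_count)
print("Embedding tokens:", token_counter.total_embedding_token_count)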

However, if Settings or the CallbackManager expects a specific initialization process, or if components capture the callback manager when they are constructed, simply assigning a new CallbackManager to Settings.callback_manager may not propagate it everywhere, and the framework may never route events to the new handler.
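
If ordering is the culprit, one pattern worth trying (a hedged sketch, not a confirmed fix) is to assign Settings.callback_manager before any LLM, index, or engine is constructed, so that components created afterwards inherit it:

# Hedged sketch: set the global callback manager *first*, then create the
# LLM and any indices/engines, so they are built against the manager that
# holds the TokenCountingHandler. Whether this resolves the zero counts
# depends on the installed version.
import tiktoken
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms.openai import OpenAI

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode,
)
Settings.callback_manager = CallbackManager([token_counter])  # set first
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.7)  # created afterwards
# ... only now build the index and chat engine ...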

To troubleshoot this issue, consider the following steps:

  • Review the release notes or documentation for version 0.10.23 of the LlamaIndex project to identify any changes related to the Settings or callback system.
  • Ensure that after setting the Settings.callback_manager, the settings are correctly applied or saved as required by the LlamaIndex framework.
  • Check if the CallbackManager or TokenCountingHandler requires any additional configuration or initialization steps in the new version.
  • Look into the source code or documentation for any deprecated methods or properties that might affect the token counting functionality.

Unfortunately, I wasn't able to find specific changes made to the TokenCountingHandler or the Settings and CallbackManager classes in the LlamaIndex project between version 0.10.22 and 0.10.23. Also, I couldn't find any known bugs or issues related to the TokenCountingHandler class in the LlamaIndex project. However, there might be more information available elsewhere or I could have missed it.

If you could provide more details about the behavior you're experiencing or any additional context, it would be helpful in further diagnosing the issue. For example, are there any specific conditions or steps that lead to the TokenCountingHandler not updating the token count as expected? Are there any other parts of your code that interact with the Settings or CallbackManager that might affect their behavior? Any additional information you can provide would be greatly appreciated.


@logan-markewich
Collaborator

It's kind of janky; try attaching it directly to the LLM too:

Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.7, callback_manager=callback_manager)

@BastiaanRudolf
Author

Hi @logan-markewich, thanks for your reply!

I changed to the following:

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode,
    verbose=True,  # set to true to see usage printed to the console
)
callback_manager = CallbackManager([token_counter])
Settings.callback_manager = callback_manager
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.7, callback_manager=callback_manager)

Unfortunately, no luck.

@logan-markewich
Collaborator

So weird.

OK, one more thing:

index.as_chat_engine(..., callback_manager=callback_manager)

@BastiaanRudolf
Author

Thanks for the help. I tried that, and unfortunately it's still all zeroes! Very weird behaviour. I'm now using the legacy ServiceContext object to make it work.
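
For reference, the ServiceContext workaround looks roughly like this (a sketch of what I mean; ServiceContext is deprecated in 0.10.x, so treat it as a stopgap):

# Sketch of the legacy ServiceContext workaround (deprecated in 0.10.x):
# the callback manager is passed into the service context, which is then
# handed to the index/engine explicitly instead of relying on Settings.
import tiktoken
from llama_index.core import ServiceContext
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms.openai import OpenAI

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode,
)
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.7),
    callback_manager=CallbackManager([token_counter]),
)
# e.g. index = VectorStoreIndex.from_documents(docs, service_context=service_context)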

@andreyka26-git

Any progress on it? I am having the same problem

@MisteFr

MisteFr commented Jun 19, 2024

Same issue here.

@GillesJ

GillesJ commented Sep 4, 2024

Same issue here: the first invocation shows a zero count, but subsequent calls update correctly and the tokens are counted fine.
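
One thing that fits this pattern (an assumption on my part, not verified against the library internals): with streaming, the handler only records the LLM tokens once generation finishes, so counters read before the response generator is drained will still show 0. A minimal check, continuing from the snippet at the top of the issue:

# Assumption: counts are recorded when the LLM event ends, i.e. only after
# the stream has been fully consumed. Drain the generator first, then read.
response = engine.stream_chat("A random message")
text = "".join(response.response_gen)  # drain the stream

print(token_counter.total_llm_token_count)
print(token_counter.prompt_llm_token_count)
print(token_counter.completion_llm_token_count)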
