Cannot see documents chart #2279

Closed
1 task done
JvdsReform opened this issue Jan 31, 2025 · 2 comments
Labels
bug Something isn't working

JvdsReform commented Jan 31, 2025

Have you searched existing issues? 🔎

  • I have searched and found no existing issues

Describe the bug

I tried creating a topic model and visualizing it with topic_model.visualize_documents. At first it worked. I recently downgraded from Python 3.12 to Python 3.11.9, and now the plot shows neither the axes nor the embeddings. Is there some mismatch of packages that I'm missing?

Image

Looking at the plotly figure it creates, it contains the coordinates of all embeddings as usual. The topic frequencies are also correct. When I open the Chrome dev console, I see the following error:

(index):2635 Uncaught TypeError: POt.indexOf is not a function
    at Object.bqe [as parse] ((index):2635:6608)
    at (index):2712:2021
    at Array.forEach (<anonymous>)
    at Vu.update ((index):2712:1938)
    at new Vu ((index):2638:59901)
    at bOe.exports [as plot] ((index):2712:10629)
    at Kve ((index):25:370736)
    at yv.plot ((index):25:369762)
    at ld.drawData ((index):25:180565)
    at li.syncOrAsync ((index):24:17595)

It seems to be from the plotly package but I cannot seem to pinpoint the exact problem since the code is minified:

Image
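
The check described above, that the figure object still holds the coordinates even though nothing is drawn, can be sketched roughly as follows. This is only an illustration: summarize_traces is a hypothetical helper (not part of BERTopic or plotly), and it assumes the figure exposes a .data sequence of traces with name, type, and x attributes, as plotly figures do. A stand-in object is used so the snippet runs without plotly installed.

```python
from types import SimpleNamespace


def summarize_traces(fig):
    """Return (name, type, number of points) for each trace of a figure-like object."""
    summary = []
    for trace in fig.data:
        x = getattr(trace, "x", None)
        n_points = 0 if x is None else len(x)
        summary.append((getattr(trace, "name", None), getattr(trace, "type", None), n_points))
    return summary


# Stand-in for a real figure; the trace names and values here are made up.
fake_fig = SimpleNamespace(
    data=[
        SimpleNamespace(name="0_example_topic", type="scattergl", x=[0.1, 0.2, 0.3]),
        SimpleNamespace(name="empty_trace", type="scattergl", x=None),
    ]
)

for name, kind, n_points in summarize_traces(fake_fig):
    print(f"{name}: {kind}, {n_points} points")
```

With the real figure, calling summarize_traces(fig_docs) should list one entry per trace. Non-zero point counts alongside an empty chart would suggest the data is present and the failure is in the browser-side rendering rather than in the model.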

Reproduction

Code in question where I create the model:

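from bertopic import BERTopic
from hdbscan import HDBSCAN
from sentence_transformers import SentenceTransformer
from sklearn.feature_extraction.text import CountVectorizer
from umap import UMAP

# Pre-compute the document embeddings once so they can be reused for plotting.
bi_encoder = SentenceTransformer("all-mpnet-base-v2")
embeddings = bi_encoder.encode(input_corpus, show_progress_bar=True)

umap_model = UMAP(n_neighbors=3, n_components=3, min_dist=0.1, metric="cosine", random_state=40)
hdbscan_model = HDBSCAN(prediction_data=True)
vectorizer_model = CountVectorizer(ngram_range=(1, 2), stop_words="english")

topic_model = BERTopic(
    embedding_model=bi_encoder,
    vectorizer_model=vectorizer_model,
    hdbscan_model=hdbscan_model,
    umap_model=umap_model,
    verbose=True,
)

# Assumed step (not shown in the original report): the model must be fitted before plotting.
topic_model.fit_transform(input_corpus, embeddings=embeddings)

# red_flag_descr must hold the same documents, in the same order, as the embeddings.
fig_docs = topic_model.visualize_documents(red_flag_descr, embeddings=embeddings)
fig_docs.show()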

Versions:

bertopic==0.16.4
umap-learn==0.5.7
hdbscan==0.8.40
sentence-transformers==3.3.1
plotly==6.0.0

BERTopic Version

0.16.4

JvdsReform added the bug label Jan 31, 2025
MaartenGr (Owner) commented

Hmmm, this might be related to the number of topics that you have. Could you try it again with more than 2 topics?

Also, it seems that plotly 6.0.0 was released just last week, so that might have something to do with it. I believe this issue might be related. Could you check if upgrading narwhals would fix this issue?

JvdsReform commented Feb 3, 2025

@MaartenGr Upgrading narwhals did not work. I currently use version 1.25.0.

EDIT: Downgrading plotly solved the issue
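
For anyone hitting the same symptom, a quick way to check whether the environment has the plotly release discussed here is sketched below. The thread only confirms that plotly 6.0.0 failed and that downgrading fixed it; treating every 6.x release as affected is an assumption, and the helper names (major_version, is_affected, installed_plotly_version) are hypothetical.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '6.0.0'."""
    return int(version.split(".")[0])


def is_affected(plotly_version: str) -> bool:
    """Assumption: any plotly release with major version >= 6 behaves like 6.0.0."""
    return major_version(plotly_version) >= 6


def installed_plotly_version():
    """Return the installed plotly version string, or None if plotly is not installed."""
    try:
        return metadata.version("plotly")
    except metadata.PackageNotFoundError:
        return None


found = installed_plotly_version()
if found is None:
    print("plotly is not installed in this environment")
elif is_affected(found):
    print(f"plotly {found} found; the thread reports the empty chart with 6.0.0")
else:
    print(f"plotly {found} found; this predates the 6.0.0 release discussed here")
```

The thread does not say which version was downgraded to. Pinning to the 5.x series, for example with pip install "plotly<6", is one plausible reading of the fix.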
