Can we do topic modeling? #5

Open
dgarijo opened this issue Jul 7, 2020 · 0 comments
Labels
enhancement New feature or request

Comments

Contributor

dgarijo commented Jul 7, 2020

Use case:
Given a piece of software, which other software is it most closely related to?

How is this done?
1- Calculate topics for the corpus from the software descriptions (e.g., using Latent Dirichlet Allocation).
2- For each topic, you get the probability that a document belongs to that topic; these probabilities can be used to group the software into clusters.
3- Given a new query (in this case a series of keywords), calculate which cluster it is most similar to.
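
A rough sketch of the three steps above, just to make the idea concrete. Everything here is an assumption for illustration: the software names and descriptions are made up, the stopword list is arbitrary, `fit_lda` is a small hand-written collapsed Gibbs sampler rather than a library implementation, and `query_topics` scores keywords with a naive per-topic word likelihood instead of full LDA inference.

```python
import math
import random
from collections import Counter

STOPWORDS = {"and", "for", "from", "the", "of"}  # arbitrary, illustration only


def tokenize(text):
    return [w for w in text.lower().split() if w not in STOPWORDS]


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Step 1: fit LDA with collapsed Gibbs sampling (hand-written sketch)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_vocab = len(vocab)
    doc_topic_counts = [[0] * n_topics for _ in docs]
    topic_word_counts = [Counter() for _ in range(n_topics)]
    topic_totals = [0] * n_topics
    assignments = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            k = rng.randrange(n_topics)
            doc_assignments.append(k)
            doc_topic_counts[d][k] += 1
            topic_word_counts[k][w] += 1
            topic_totals[k] += 1
        assignments.append(doc_assignments)
    # Resample each token's topic conditioned on all other tokens.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                doc_topic_counts[d][k] -= 1
                topic_word_counts[k][w] -= 1
                topic_totals[k] -= 1
                weights = [
                    (doc_topic_counts[d][t] + alpha)
                    * (topic_word_counts[t][w] + beta)
                    / (topic_totals[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights)[0]
                assignments[d][i] = k
                doc_topic_counts[d][k] += 1
                topic_word_counts[k][w] += 1
                topic_totals[k] += 1
    # Smoothed estimates of P(topic | document) and P(word | topic).
    doc_topic = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic_counts[d]]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        {w: (topic_word_counts[k][w] + beta) / (topic_totals[k] + n_vocab * beta)
         for w in vocab}
        for k in range(n_topics)
    ]
    return doc_topic, topic_word


def query_topics(keywords, topic_word):
    """Step 3 (naive): normalized likelihood of the keywords under each topic."""
    log_scores = [
        sum(math.log(words[w]) for w in keywords if w in words)
        for words in topic_word
    ]
    top = max(log_scores)
    scores = [math.exp(s - top) for s in log_scores]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical corpus: software name -> description.
descriptions = {
    "hydro-sim": "hydrology model simulating river flow and flood risk",
    "flood-map": "flood risk maps from river flow and rainfall data",
    "text-miner": "natural language text mining and keyword extraction",
    "doc-parse": "text extraction and language detection for documents",
}
names = list(descriptions)
docs = [tokenize(descriptions[name]) for name in names]
doc_topic, topic_word = fit_lda(docs, n_topics=2)

# Step 2: put each software in the cluster of its most probable topic.
clusters = {}
for name, dist in zip(names, doc_topic):
    clusters.setdefault(dist.index(max(dist)), []).append(name)

# Step 3: find the cluster closest to a keyword query.
query = query_topics(["flood", "river"], topic_word)
best = query.index(max(query))
print("query topic distribution:", query)
print("software in the closest cluster:", clusters.get(best, []))
```

With a corpus this small the topics are not very stable, so the number of topics, the priors (`alpha`, `beta`) and the preprocessing would all need tuning on the real descriptions.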

We could also define a metric based on graph similarity (still to be explored).

dgarijo added the enhancement label Jul 7, 2020