jyzhang111/Holocron--Shareable_Code

This package provides topic modeling and a search function that retrieves similar embeddings. main.py produces embeddings from a database of abstracts and generates the topic modeling output. search.py searches for and returns topics related to a search query. Because search.py relies on the embeddings that main.py produces, main.py must be run first.

To use these functions, you need a database table and the names of its id and abstract text columns, Weaviate cluster credentials, the class_name of the Weaviate cluster, and a list of the column names to include in the 3D output. If the code does not work, you may also need new Google Translate credentials in application_default_credentials.json.
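The column requirements above can be checked before running main.py. The following is a minimal sketch, not part of the package: `missing_columns` is a hypothetical helper, and the list of available columns is assumed to come from your own database query.

```python
from typing import Iterable, List


def missing_columns(
    available: Iterable[str],
    id_column_name: str,
    text_column_name: str,
    custom_data: Iterable[str],
) -> List[str]:
    """Return the required column names that are absent from the table, in order."""
    available_set = set(available)
    required = [id_column_name, text_column_name, *custom_data]
    # dict.fromkeys drops duplicate names while keeping the original order.
    return [name for name in dict.fromkeys(required) if name not in available_set]


# Hypothetical column list; in practice it could be read from cursor.description
# after running a SELECT against the table.
table_columns = ["id", "text", "title", "source"]
print(missing_columns(table_columns, "id", "text", ["title", "source", "source_date"]))
# prints ['source_date']
```

An empty list means every column named in the call is present in the table.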

To install, run pip install -e database_topic_modeling

Example usage of main.py (see the function comments for more details):

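import os

import mysql.connector
from database_topic_modeling.main import produce_cluster_graphs_and_3d_output

# Connect to the MySQL database that holds the abstracts.
# The password is read from an environment variable instead of being hardcoded
# (MYSQL_PASSWORD is an assumed variable name).
conn = mysql.connector.connect(
    host="35.245.30.192",
    user="root",
    password=os.environ["MYSQL_PASSWORD"],
    database="test"
)

# Produce the embeddings and the topic modeling output for the table.
produce_cluster_graphs_and_3d_output(
    conn,
    tablename="article_cn",
    id_column_name="id",
    text_column_name="text",
    weaviate_url="https://database-cluster-e8lalwdl.weaviate.network",
    weaviate_auth_api_key=os.environ["WEAVIATE_API_KEY"],  # assumed variable name
    class_name="Database",
    custom_data=["title", "source", "source_date"]  # columns to include in the 3D output
)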

Example usage of search.py (see the function comments for more details):

import os

from database_topic_modeling.search import search_concept_for_output

# Prompt for a concept, then return the related topics from the Weaviate cluster.
# Run main.py first so that the embeddings exist.
inp = input('What concept would you like to search? \n')
search_concept_for_output(
    inp,
    weaviate_url="https://database-cluster-e8lalwdl.weaviate.network",
    weaviate_auth_api_key=os.environ["WEAVIATE_API_KEY"],  # assumed variable name
    class_name="Database",
    id_column_name="id",
    custom_data=["title", "source", "source_date"],
)
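
Both example calls take the same Weaviate URL, API key, class_name, id column name, and custom_data list. One way to keep them in sync is to define those values once, as in the sketch below. `WeaviateSettings` and its methods are hypothetical helpers rather than part of the package, and the WEAVIATE_API_KEY environment variable name is an assumption.

```python
import os
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class WeaviateSettings:
    """Values shared by the main.py and search.py example calls (hypothetical helper)."""

    weaviate_url: str
    weaviate_auth_api_key: str
    class_name: str
    id_column_name: str
    custom_data: List[str] = field(default_factory=list)

    def search_kwargs(self) -> Dict[str, Any]:
        """Keyword arguments used in the search.py example."""
        return {
            "weaviate_url": self.weaviate_url,
            "weaviate_auth_api_key": self.weaviate_auth_api_key,
            "class_name": self.class_name,
            "id_column_name": self.id_column_name,
            "custom_data": list(self.custom_data),
        }

    def main_kwargs(self, tablename: str, text_column_name: str) -> Dict[str, Any]:
        """Keyword arguments used in the main.py example (adds the table details)."""
        return {
            **self.search_kwargs(),
            "tablename": tablename,
            "text_column_name": text_column_name,
        }


settings = WeaviateSettings(
    weaviate_url="https://database-cluster-e8lalwdl.weaviate.network",
    # Assumed variable name; falls back to an empty string when it is unset.
    weaviate_auth_api_key=os.environ.get("WEAVIATE_API_KEY", ""),
    class_name="Database",
    id_column_name="id",
    custom_data=["title", "source", "source_date"],
)

# With the package installed, the two calls would then look like this:
# produce_cluster_graphs_and_3d_output(conn, **settings.main_kwargs("article_cn", "text"))
# search_concept_for_output(inp, **settings.search_kwargs())
```

Keeping the shared values in one place means search.py queries the same class_name that main.py populated.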
