GitHub - zkharryhhhh/DMESSM: Short Text Clustering with A Deep Multi-Embedded Self-Supervised Model

DMESSM: Short Text Clustering with A Deep Multi-Embedded Self-Supervised Model

This repository is an implementation of "Short Text Clustering with A Deep Multi-Embedded Self-Supervised Model". The implementation is based on DEC-keras and SIFAuto.

Install requirements

conda install --yes --file requirements.txt

Data

We currently release the stackoverflow data. The word2vec embedding is from STCC. The SBERT embedding was computed by us and is provided in stackoverflow.npy.

We use four datasets: stackoverflow, SearchSnippets, Tweet89 and 20ngnewsshort. The data for all of them, including the different embeddings, will be released.
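Before running the model, it can help to check what the released embedding file contains. The snippet below is a minimal sketch, not part of this repository: the helper name `describe_embeddings` and the path `data/stackoverflow/stackoverflow.npy` are assumptions, and it assumes the file holds a plain numeric NumPy array (one row per text).

```python
import os

import numpy as np


def describe_embeddings(path):
    """Load a .npy array and report its shape and dtype (hypothetical helper)."""
    emb = np.load(path)  # assumes a plain numeric array, not a pickled object
    return {"shape": emb.shape, "dtype": str(emb.dtype)}


if __name__ == "__main__":
    # Assumed location of the released SBERT embeddings; adjust to your checkout.
    path = "data/stackoverflow/stackoverflow.npy"
    if os.path.exists(path):
        print(describe_embeddings(path))
    else:
        print(f"{path} not found; check where stackoverflow.npy is stored")
```

If the reported shape is two-dimensional, the first dimension should match the number of texts in the dataset and the second the embedding size.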

Run an example

python DMESSM.py --dataset stackoverflow --maxiter 2600 --ae_weights data/stackoverflow/results/ae_weights.hs --save_dir data/stackoverflow/results
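The command above follows a `data/<dataset>/results` path pattern, so the same invocation can be assembled for another dataset once its data is available. The helper below is a sketch under that assumption: `build_command` and its parameters are hypothetical names, and the flags and the weights file name are copied from the example command rather than from the script's source.

```python
import shlex


def build_command(dataset, maxiter=2600, data_root="data",
                  weights_name="ae_weights.hs"):
    """Assemble the DMESSM.py argument list for one dataset (hypothetical helper)."""
    # Assumed layout: weights and outputs both live in <data_root>/<dataset>/results.
    results_dir = f"{data_root}/{dataset}/results"
    return [
        "python", "DMESSM.py",
        "--dataset", dataset,
        "--maxiter", str(maxiter),
        "--ae_weights", f"{results_dir}/{weights_name}",
        "--save_dir", results_dir,
    ]


if __name__ == "__main__":
    # Print the command as a single shell-quoted string.
    print(shlex.join(build_command("stackoverflow")))
```

The returned list can be passed directly to `subprocess.run` from the repository root; only stackoverflow is usable until the other datasets are released.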

Important notes

We release the complete code!
