articulus_divisio

Tandem Approaches

Unsupervised classification (clustering) of multiple text datasets using various algorithms, including:

  • K-means
  • Agglomerative Clustering
    • Ward
    • Complete
    • Single
    • Average
  • HDBSCAN
  • Spectral Clustering
  • Gaussian Mixtures
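
As a rough illustration of how the algorithms listed above can be compared, here is a minimal sketch using scikit-learn. It is not the notebooks' code: the synthetic `make_blobs` data stands in for document embeddings, and the `cluster_all` helper and its parameter choices are assumptions made for this example. HDBSCAN is left out because its availability depends on the installed library version.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans, SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import normalized_mutual_info_score
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for document embeddings (assumption: not the repository's data).
X, y_true = make_blobs(n_samples=300, centers=4, n_features=16, random_state=0)


def cluster_all(X, n_clusters, seed=0):
    """Fit several clustering models and return {name: label array}."""
    models = {
        "kmeans": KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
        "spectral": SpectralClustering(
            n_clusters=n_clusters, affinity="nearest_neighbors", random_state=seed
        ),
        "gmm": GaussianMixture(n_components=n_clusters, random_state=seed),
    }
    # One agglomerative model per linkage criterion listed above.
    for linkage in ("ward", "complete", "single", "average"):
        models[f"agglo_{linkage}"] = AgglomerativeClustering(
            n_clusters=n_clusters, linkage=linkage
        )
    return {name: model.fit_predict(X) for name, model in models.items()}


all_labels = cluster_all(X, n_clusters=4)
# With labeled data, each partition can be scored against the ground truth.
scores = {
    name: normalized_mutual_info_score(y_true, labels)
    for name, labels in all_labels.items()
}
for name, score in sorted(scores.items()):
    print(f"{name:16s} NMI = {score:.3f}")
```

Normalized mutual information is used here only because the labeled datasets allow an external score; any other external index could be substituted.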

The classification is based on multiple word representations, including:

  • Word2Vec
  • GloVe
  • BERT
  • RoBERTa
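
One common way to turn static word vectors such as Word2Vec or GloVe into a document representation is to average the vectors of the words in the document. The sketch below shows that idea only; whether the notebooks pool vectors this way is an assumption, and `toy_vectors` and `embed_document` are hypothetical names with made-up values. Contextual models such as BERT and RoBERTa would instead need a pooling step over their token outputs.

```python
import numpy as np

# Hypothetical toy embedding table; a real run would load pretrained vectors.
toy_vectors = {
    "stock": np.array([0.9, 0.1, 0.0]),
    "market": np.array([0.8, 0.2, 0.1]),
    "football": np.array([0.0, 0.9, 0.3]),
    "match": np.array([0.1, 0.8, 0.4]),
}


def embed_document(text, vectors, dim=3):
    """Average the vectors of known tokens; return zeros if none are known."""
    tokens = text.lower().split()
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return np.zeros(dim)
    return np.mean(known, axis=0)


docs = ["Stock market", "Football match", "unknown words only"]
# One row per document: this matrix is what a clustering algorithm would consume.
doc_matrix = np.vstack([embed_document(d, toy_vectors) for d in docs])
print(doc_matrix)
```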

The representations are projected into lower-dimensional spaces using dimensionality reduction techniques, including:

  • PCA
  • t-SNE
  • UMAP
  • Simple Autoencoder
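
In a tandem approach the two steps run one after the other: the data is reduced first, then the reduced points are clustered. The following is a minimal sketch of that pipeline with scikit-learn, covering PCA and t-SNE only. The synthetic data, the `tandem_cluster` helper, the two-dimensional target space, and the choice of K-means as the second step are all assumptions made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import normalized_mutual_info_score

# Synthetic stand-in for document embeddings (assumption: not the repository's data).
X, y_true = make_blobs(n_samples=200, centers=3, n_features=20, random_state=0)


def tandem_cluster(X, reducer, n_clusters, seed=0):
    """Step 1: reduce the data. Step 2: cluster the reduced points."""
    Z = reducer.fit_transform(X)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(Z)
    return Z, labels


reducers = {
    "pca": PCA(n_components=2, random_state=0),
    "tsne": TSNE(n_components=2, perplexity=30, random_state=0),
}
results = {name: tandem_cluster(X, red, n_clusters=3) for name, red in reducers.items()}

for name, (Z, labels) in results.items():
    nmi = normalized_mutual_info_score(y_true, labels)
    print(f"{name}: reduced shape {Z.shape}, NMI = {nmi:.3f}")
```

Because the reduction is fitted without any knowledge of the clustering objective, the low-dimensional space is not guaranteed to preserve cluster structure, which is the motivation for the simultaneous methods in the next section.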

First Submission:

Labeled Data (Classic4 and BBC)

Open In Colab

Second Submission:

Labeled Data (Classic4 and BBC)

Open In Colab

Unlabeled Data

Articles1

Open In Colab

Articles2

Open In Colab

Simultaneous Dimensionality Reduction and Classification

Unsupervised classification (clustering) of multiple text datasets using various algorithms, including:

  • Reduced k-means and Factorial k-means
  • Deep Clustering Network (DCN)
  • Deep k-means (DKM)
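
To make the "simultaneous" idea concrete, here is a minimal NumPy sketch of reduced k-means, which alternates between updating an orthonormal projection and reassigning points in the projected space. It is an illustrative reading of the usual alternating scheme, not the notebooks' implementation: the `reduced_kmeans` function, its random initialisation, and its handling of empty clusters are assumptions. Factorial k-means, DCN, and DKM are not covered.

```python
import numpy as np


def reduced_kmeans(X, n_clusters, n_components, n_iter=50, seed=0):
    """Alternate between a projection update and a cluster reassignment."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)  # center the data
    n = X.shape[0]
    labels = rng.integers(n_clusters, size=n)  # random initial partition
    for _ in range(n_iter):
        # Cluster means in the full space; an empty cluster is re-seeded
        # with a random data point (assumption made for this sketch).
        means = np.vstack([
            X[labels == k].mean(axis=0) if np.any(labels == k) else X[rng.integers(n)]
            for k in range(n_clusters)
        ])
        counts = np.bincount(labels, minlength=n_clusters)
        # Between-cluster scatter: sum over clusters of n_k * m_k m_k^T.
        between = (means * counts[:, None]).T @ means
        # Projection update: top eigenvectors (eigh sorts eigenvalues ascending).
        _, eigvecs = np.linalg.eigh(between)
        A = eigvecs[:, -n_components:]
        # Reassignment: nearest centroid in the reduced space.
        Z, C = X @ A, means @ A
        dists = ((Z[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, A, C


# Toy usage: three Gaussian groups in 10 dimensions (synthetic data).
rng = np.random.default_rng(1)
centers = rng.normal(scale=5.0, size=(3, 10))
X = np.vstack([c + rng.normal(size=(50, 10)) for c in centers])
labels, A, C = reduced_kmeans(X, n_clusters=3, n_components=2)
print(labels.shape, A.shape, C.shape)
```

Like ordinary k-means, this scheme can stop in a local optimum, so several random restarts are usually advisable.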

The classification is based on multiple word representations, including:

  • Word2Vec
  • GloVe
  • BERT
  • RoBERTa

First Submission:

Labeled Data (Classic4 and BBC)

Open In Colab

Second Submission:

Labeled Data (Classic4 and BBC)

Open In Colab

Unlabeled Data (Articles1 and Articles2)

Open In Colab