This repository has been archived by the owner on Jan 23, 2024. It is now read-only.

Multinomial Adversarial Networks for Multi-Domain Text Classification (NAACL 2018)

ccsasuke/man

Multinomial Adversarial Nets

This repo contains the source code for our paper:

Multinomial Adversarial Networks for Multi-Domain Text Classification
Xilun Chen, Claire Cardie
NAACL-HLT 2018
paper, bibtex

Requirements:

  • Python 3.6
  • PyTorch 0.4
  • PyTorchNet
  • scipy
  • tqdm (for progress bar)

An Anaconda environment.yml file is also provided.

Dev version

The dev branch contains the full version of our code, including some options we experimented with but did not include in the final version.

Follow-up Paper

You might also be interested in our follow-up ACL 2019 paper (source code available here) with an improved model and better performance.

Before Running

The pre-trained word embeddings file exceeds GitHub's 100 MB file size limit and is therefore provided as a gzipped tarball. Extract it before running any experiment:

tar -xvf data/w2v/word2vec.tar.gz -C data/w2v/
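
If the tar command is not available on your system, the same extraction can be sketched with Python's standard tarfile module. The helper name extract_embeddings is hypothetical and not part of this repo; the default paths simply mirror the command above, and the script assumes it is run from the repository root.

```python
import tarfile
from pathlib import Path


def extract_embeddings(archive="data/w2v/word2vec.tar.gz", dest="data/w2v/"):
    """Extract a gzipped tarball into dest, like `tar -xvf archive -C dest`.

    Hypothetical helper (not part of the repo). Returns the member names.
    """
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive), "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(str(dest_path))
    return names


if __name__ == "__main__":
    archive = Path("data/w2v/word2vec.tar.gz")
    if archive.exists():
        for name in extract_embeddings(archive):
            print(name)
    else:
        print(f"{archive} not found; run this from the repository root")
```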

Experiment 1: MDTC on the multi-domain Amazon dataset

cd code/
python train_man_exp1.py --dataset prep-amazon --model mlp

Experiment 2: Multi-Source Domain Adaptation

cd code/
# target domain: books
python train_man_exp2.py --dataset prep-amazon --model mlp --no_wgan_trick --domains dvd electronics kitchen --unlabeled_domains books --dev_domains books
# target domain: dvd
python train_man_exp2.py --dataset prep-amazon --model mlp --no_wgan_trick --domains books electronics kitchen --unlabeled_domains dvd --dev_domains dvd
# target domain: electronics
python train_man_exp2.py --dataset prep-amazon --model mlp --no_wgan_trick --domains books dvd kitchen --unlabeled_domains electronics --dev_domains electronics
# target domain: kitchen
python train_man_exp2.py --dataset prep-amazon --model mlp --no_wgan_trick --domains books dvd electronics --unlabeled_domains kitchen --dev_domains kitchen
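
The four commands follow a leave-one-out pattern over the Amazon domains. A minimal sketch of how they could be generated is shown here, assuming the target domain is always excluded from --domains and passed to both --unlabeled_domains and --dev_domains. The helper exp2_command is hypothetical and not part of this repo; it uses only the flags shown above.

```python
# The four domains used in the Experiment 2 commands.
DOMAINS = ["books", "dvd", "electronics", "kitchen"]


def exp2_command(target, domains=DOMAINS):
    """Build the train_man_exp2.py argument list for one target domain.

    Hypothetical helper: the remaining domains become the labeled sources,
    and the target is used as the unlabeled and dev domain.
    """
    if target not in domains:
        raise ValueError(f"unknown target domain: {target}")
    sources = [d for d in domains if d != target]
    return [
        "python", "train_man_exp2.py",
        "--dataset", "prep-amazon",
        "--model", "mlp",
        "--no_wgan_trick",
        "--domains", *sources,
        "--unlabeled_domains", target,
        "--dev_domains", target,
    ]


if __name__ == "__main__":
    # Print one command per target domain; run them from the code/ directory.
    for target in DOMAINS:
        print(f"# target domain: {target}")
        print(" ".join(exp2_command(target)))
```

To launch a run directly instead of printing it, the returned list can be passed to subprocess.run from inside the code/ directory.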

Experiment 3: MDTC on the FDU-MTL dataset

cd code/
python train_man_exp3.py --dataset fdu-mtl --model cnn --max_epoch 50

A larger batch size can also be used to reduce the training time.
