EMNLP, 2022, Leveraging Only the Category Name for Aspect Detection through Prompt-based Constrained Clustering #92

Open
Sepideh-Ahmadian opened this issue Sep 27, 2024 · 2 comments

@Sepideh-Ahmadian
Member

Paper
Leveraging Only the Category Name for Aspect Detection through Prompt-based Constrained Clustering

Introduction
This paper proposes a method for Aspect Category Detection (ACD) using a prompt-based constrained clustering approach. ACD aims to identify aspects from online product reviews to analyze user sentiment toward specific product features. The proposed method, called PCCT (Prompt-based Constrained Clustering Technique), uses only the aspect category name and a pre-trained language model (LM) to extract meaningful keywords and cluster the reviews into predefined categories.

Main Problem
The main problem addressed is to minimize human supervision. Traditional methods require significant human effort for annotation, and unsupervised methods often perform poorly. The paper proposes a method that uses only the aspect category name to guide the clustering of review segments into relevant categories.

Illustrative Example
An illustrative example in the paper demonstrates the clustering of CitySearch restaurant reviews into categories such as "Food," "Ambience," and "Staff." For example:
Category: Food → Keywords: "fish," "salad," "fruit," etc.
Category: Ambience → Keywords: "music," "atmosphere," "decor," etc.

Input
Unlabeled review corpus and predefined aspect category names.

Output
Clustered reviews categorized into predefined aspects based on the generated aspect vocabularies.
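
To make the input and output shapes concrete, here is a minimal sketch that assigns review segments to predefined categories by counting keyword hits. This is NOT the PCCT method (which relies on a pre-trained LM and constrained clustering); it is a plain keyword-overlap stand-in. The vocabularies reuse the keywords from the illustrative example above, and the names `ASPECT_VOCAB`, `assign_category`, and `cluster_reviews` are hypothetical.

```python
import re

# Hypothetical aspect vocabularies; keywords come from the illustrative example.
ASPECT_VOCAB = {
    "Food": {"fish", "salad", "fruit"},
    "Ambience": {"music", "atmosphere", "decor"},
}


def assign_category(segment, vocab=ASPECT_VOCAB, fallback="Unknown"):
    """Return the category whose vocabulary matches the most tokens.

    Assumption: simple keyword counting, used here only to illustrate the
    input/output contract, not the paper's clustering procedure.
    """
    tokens = re.findall(r"[a-z]+", segment.lower())
    best_category, best_hits = fallback, 0
    for category, keywords in vocab.items():
        hits = sum(1 for token in tokens if token in keywords)
        # Strict '>' keeps the first category on ties and the fallback on zero hits.
        if hits > best_hits:
            best_category, best_hits = category, hits
    return best_category


def cluster_reviews(segments, vocab=ASPECT_VOCAB):
    """Group review segments by their assigned category."""
    clusters = {}
    for segment in segments:
        clusters.setdefault(assign_category(segment, vocab), []).append(segment)
    return clusters


if __name__ == "__main__":
    reviews = [
        "The salad and fish were fresh.",
        "Loved the music and the decor.",
        "We waited an hour for a table.",
    ]
    for category, members in cluster_reviews(reviews).items():
        print(category, members)
```

Segments that match no keyword fall into the `Unknown` bucket, which is one place where a purely lexical approach falls short of the paper's LM-based approach.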

Motivation
The authors were motivated by the challenges in ACD, which traditionally requires manual annotation or expert knowledge. They aimed to minimize human involvement and leverage pretrained language models to improve detection performance without significant manual effort.

Related works and their gaps
The paper addresses the gap between unsupervised and weakly supervised methods for ACD, which often require expert-annotated data or fail to deliver high performance (Brody and Elhadad, 2010; Chen et al., 2014; Pablos et al., 2018; Zheng et al., 2020; Özyurt and Akcayol, 2021; Chen et al., 2016; Xiong and Ji, 2016; Zhao et al., 2014). These works rely on LDA and traditional clustering algorithms.
The following works require hand-crafted mapping: He et al., 2017; Luo et al., 2019; Shi et al., 2021; Chebolu et al., 2022; Angelidis and Lapata, 2018a; Huang et al., 2020; Karamanolakis et al., 2019.
The authors propose a method that reduces reliance on expert input and improves clustering performance using prompt-based learning.

Contribution of this paper
The paper proposes PCCT, a deep constrained clustering method that uses only aspect category names. It also introduces a novel method that provides more information for the clustering constraints. The proposed method outperforms previously proposed unsupervised and weakly supervised methods.

Proposed methods
Not included

Experiments
The model is evaluated on nine datasets, including:
SemEval (Restaurant, Laptop): For aspect detection in product reviews.
CitySearch: For aspect detection in restaurant reviews.
OPOSum: A product review dataset with domains like Bags, Bluetooth, Boots, Keyboards, TVs, and Vacuums.

Implementation
https://github.com/liyazheng/PCCT.

Gaps of this work
The model relies heavily on predefined category names, which may not capture complex review semantics in certain domains. In addition, the approach has only been tested on ACD and has not been extended to sentiment analysis.

@Sepideh-Ahmadian Sepideh-Ahmadian added the literature-review Summary of the paper related to the work label Sep 27, 2024
@Sepideh-Ahmadian Sepideh-Ahmadian self-assigned this Sep 27, 2024
@hosseinfani
Member

@Sepideh-Ahmadian
Can we add CitySearch and OPOSum to LADy?

@Sepideh-Ahmadian
Member Author

@hosseinfani Sure, in the next phase we can evaluate these datasets for addition to LADy. Based on the student reports, there are other resources as well, such as Restaurant-ACOS and Laptop-ACOS.
