Research Topic prediction #282

Merged 1 commit on Oct 12, 2024


ramana2074
Contributor

Related Issues or bug

  • Research topic prediction: predict the topics for each article in the test set.

Fixes: #245

Proposed Changes

  • Info about Changes:
    Cleaned the data by removing single-letter words and non-alphabetic characters.
    Added minimum text length validation to handle cases where the abstract or title is too short.
    Refined the vectorization process to remove irrelevant tokens and improve classification accuracy.
    Adjusted the model hyperparameters for better generalization across multiple topics.
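
For reviewers, the cleaning and length-validation steps described above could look roughly like the sketch below. This is not the code from the commit: the function names `clean_text` and `is_long_enough` and the `MIN_TEXT_LENGTH` threshold are hypothetical, chosen only for illustration, and the PR does not state the actual threshold it uses.

```python
import re

# Hypothetical threshold; the PR does not state the value it uses.
MIN_TEXT_LENGTH = 20


def clean_text(text: str) -> str:
    """Lowercase, drop non-alphabetic characters, and remove single-letter words."""
    # Replace anything that is not an ASCII letter or whitespace with a space.
    letters_only = re.sub(r"[^a-z\s]", " ", text.lower())
    # Keep only tokens longer than one character.
    tokens = [tok for tok in letters_only.split() if len(tok) > 1]
    return " ".join(tokens)


def is_long_enough(title: str, abstract: str, min_length: int = MIN_TEXT_LENGTH) -> bool:
    """Return True when the cleaned title plus abstract has at least min_length characters."""
    combined = clean_text(f"{title} {abstract}")
    return len(combined) >= min_length
```

Rows that fail `is_long_enough` could be skipped or given a fallback prediction before vectorization; which of the two the PR does is not stated here.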

Additional Info

  • This fix also addresses part of the class imbalance issue: certain categories (e.g., Quantitative Biology, Quantitative Finance) had significantly fewer articles, which led to lower accuracy for those categories. Future improvements could address this imbalance more effectively with techniques such as oversampling or data augmentation.
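
As a possible starting point for the oversampling idea mentioned above, here is a minimal sketch of random oversampling in plain Python. It is not part of this PR. The function name `random_oversample` is hypothetical, and the sketch assumes each article carries a single label; if articles can belong to several topics at once, the grouping step would need to change.

```python
import random


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group texts by their (single) label.
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)
    if not by_label:
        return [], []

    # Every class is padded up to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels
```

Oversampling would normally be applied to the training split only, so that duplicated articles do not leak into validation or test data.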


👋 Thank you for opening this pull request! We appreciate your contribution to improving this project. Your PR is under review, and we'll get back to you shortly.
Don't forget to mention the issue you solved!

To help move the process along, please tag @UppuluriKalyani, @Neilblaze, and @SaiNivedh26 for a faster review!

@UppuluriKalyani UppuluriKalyani merged commit 3e28427 into UppuluriKalyani:main Oct 12, 2024
3 checks passed

🎉🎉 Thank you for your contribution! Your PR #282 has been merged! 🎉🎉

Development

Successfully merging this pull request may close these issues.

Research-topic-Prediction