Sungjoon Park, Kiwoong Park, Jaimeen Ahn, Alice Oh
To appear in the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)
We analyze social media to detect suicide risk among military personnel, a task that is especially important for countries with compulsory military service, such as South Korea. From a widely used Korean social Q&A site, we collect posts containing military-relevant content written by active-duty military personnel. The posts are then annotated by two groups of experts: military experts and mental health experts. Our dataset contains 2,791 posts with 13,955 corresponding expert annotations of suicide risk levels, and it is available to researchers who consent to a research ethics agreement. Using various fine-tuned state-of-the-art language models, we predict the level of suicide risk, reaching an F1 score of .88 for classifying the risks.