This repository contains a module for query-focused summarization of discussion threads in the DISCOSUMO project. It takes two input arguments:
- a directory of threads (XML format is described here)
- a tab-separated file of the format
query,clicked title,threadid
It implements Maximal Marginal Relevance (MMR) for a query-thread pair, and scores the posts in a thread using MMR.
The input format is highly specific for the DISCOSUMO project but the implementation of MMR might be reusable in other contexts.
See the LICENSE file for license rights and limitations (GNU-GPL v3.0).