
Extractive summarization of discussion forum threads

In this talk, I address extractive summarization of long threads in online discussion fora. We studied the disagreement between human raters on this summarization task and its implications for automatic summarization. In a user study, we showed long threads to 10 different raters and asked them to select the posts they considered most important for the thread. We then trained a model for automatic extractive summarization on this human-labelled data. Although inter-rater agreement on the summarization task was low, the automatic summarizer obtained fair results in terms of precision and recall. Moreover, in a blind side-by-side comparison between a summary created by our model and a summary created by a human subject, the model-generated summary was preferred almost as often as the human-generated one. This shows that even for a summarization task with low inter-rater agreement, a model can be trained to generate sensible summaries.
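To make the measures above concrete, here is a minimal sketch of how post selections could be compared. It is not the code or data from the study: the abstract does not say which agreement statistic was used, so mean pairwise Cohen's kappa is an assumption, and the function names and the toy thread are hypothetical.

```python
from itertools import combinations


def precision_recall(selected, reference):
    """Precision and recall of one set of selected post ids against a reference set."""
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    precision = overlap / len(selected) if selected else 0.0
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall


def cohen_kappa(a, b, n_posts):
    """Cohen's kappa for two raters making a binary select/skip decision per post."""
    a, b = set(a), set(b)
    both = len(a & b)                # posts selected by both raters
    neither = n_posts - len(a | b)   # posts skipped by both raters
    observed = (both + neither) / n_posts
    p_a, p_b = len(a) / n_posts, len(b) / n_posts
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    if expected == 1.0:              # both raters selected everything or nothing
        return 1.0
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(selections, n_posts):
    """Average Cohen's kappa over all pairs of raters (assumed agreement measure)."""
    pairs = list(combinations(selections, 2))
    return sum(cohen_kappa(a, b, n_posts) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy data, invented for illustration: a 10-post thread and three raters.
    n_posts = 10
    raters = [{0, 2, 5}, {0, 3, 5, 7}, {1, 2, 5}]
    model_summary = {0, 2, 5, 8}  # hypothetical model output

    print("mean pairwise kappa:", mean_pairwise_kappa(raters, n_posts))
    for i, rater in enumerate(raters):
        print("rater", i, "precision/recall:", precision_recall(model_summary, rater))
```

Scoring the model against each rater separately, as in the loop above, is one simple option; when raters disagree, the choice of reference (a single rater, a majority vote, or an average over raters) changes the precision and recall that come out.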

More information can be found on the project website:
