Master's Thesis

Conversation-aware Classification of News Comments

By Johannes Filter at the Web Science Group at the Hasso-Plattner-Institute (University of Potsdam) under the supervision of Dr. Ralf Krestel. April 2019.

Abstract

Online newspapers close their comment sections because they cannot cope with the sheer amount of user-generated content. Natural language processing makes it possible to classify news comments automatically and thereby support moderators efficiently. Identifying hate speech is only a special case of comment classification; in this master's thesis, we focus on classifying along arbitrary criteria, e.g., sentiment, off-topic, or controversial. In contrast to prior work, we consider the conversational context to be essential for understanding a comment's true meaning. We introduce a preprocessing technique that prepends previous comments to training samples so that the state-of-the-art language-model-based text classification technique ULMFiT can be applied. We conducted experiments on nine categories of the research dataset Yahoo News Annotated Comment Corpus. With conversation-aware models, we increased the F1 micro and F1 macro scores on average by 1.53% and 3.08%, respectively. However, the differences from conversation-agnostic models vary among the categories. We achieved the biggest improvements when identifying whether a comment is off-topic or whether it agrees or disagrees with other comments.
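
The idea of prepending previous comments to a training sample can be illustrated with a minimal sketch. This is not the code from this repository: the `xx_reply` separator token, the `build_sample` helper, the `max_context` parameter, and the dictionary layout with `text` and `parent` keys are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical separator token between comments; not taken from the thesis.
SEP = " xx_reply "


def build_sample(comment_id: str,
                 comments: Dict[str, dict],
                 max_context: int = 2) -> str:
    """Prepend up to `max_context` ancestor comments to the target comment."""
    target = comments[comment_id]
    ancestors: List[str] = []
    parent_id: Optional[str] = target.get("parent")
    # Walk up the reply chain, collecting the texts of previous comments.
    while parent_id is not None and len(ancestors) < max_context:
        parent = comments[parent_id]
        ancestors.append(parent["text"])
        parent_id = parent.get("parent")
    # Ancestors were collected newest-first; restore chronological order.
    ancestors.reverse()
    return SEP.join(ancestors + [target["text"]])


# Toy thread (invented data): c3 replies to c2, which replies to c1.
thread = {
    "c1": {"text": "The new tax plan is a disaster.", "parent": None},
    "c2": {"text": "Why? It lowers rates for most people.", "parent": "c1"},
    "c3": {"text": "I disagree.", "parent": "c2"},
}
sample = build_sample("c3", thread)
```

Taken alone, "I disagree." gives a classifier almost nothing to work with; with the preceding comments prepended, the sample carries what the comment is disagreeing with. The resulting string would then be fed to the text classifier in place of the bare comment.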

Caveats

This repository contains all the code used for the thesis, but it is rather unstructured. Nevertheless, it may be helpful for your work.

License

MIT.

