GermanPolarityClues - A Lexical Resource for German Sentiment Analysis

Feel free to use and download the GermanPolarityClues dictionary, a new publicly available lexical resource for sentiment analysis of German. The resource offers 10,141 polarity features, each associated with three numerical polarity scores that indicate the positive, negative, and neutral direction of the term. We showed empirically that GermanPolarityClues is a valuable resource for automatic feature selection, polarity identification, and sentiment analysis in German (F1 = 0.88).


Introduction

* Update - 21.04.2012: updated SentiWS synchronization
* Update - 22.03.2011: updated corpus frequencies

GermanPolarityClues was built by semi-automatically translating existing English-based sentiment resources into German, drawing on three different datasets.

Data Format

Version 0.2 (April 2012): the GermanPolarityClues-2012 dataset is synchronized with SentiWS v1.8c (Remus et al., 2010).

Version 0.1 (March 2011): the GermanPolarityClues-2011 dataset is synchronized with SubjectivityClues (Wiebe et al., 2005) and SentiSpin (Takamura et al., 2005).

Each line in each text file corresponds to a single polarity entry. The fields of an entry are separated by tabs (\t), e.g.:

March, 2011 Format

    Feature     (\t)  Lemma       (\t)  Part-of-Speech (\t)  PositiveRating (\t)  NegativeRating (\t)  NeutralRating (\t)
    überzahlte        überzahlen        V                    0                    1                    0

    PositiveCorpusProbability (\t)  NegativeCorpusProbability (\t)  NeutralCorpusProbability
    0                               0.6                             0.4
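
A minimal sketch of reading the March 2011 format in Python. It assumes that the two header rows shown above are one wrapped line, so that each entry is a single line of nine tab-separated fields; if the files store them differently, the field count check will fail. The names `Entry2011` and `parse_2011` are hypothetical and not part of the resource.

```python
from typing import Iterable, Iterator, NamedTuple


class Entry2011(NamedTuple):
    """One entry of the March 2011 format (assumed: nine fields per line)."""
    feature: str
    lemma: str
    pos: str
    positive_rating: float
    negative_rating: float
    neutral_rating: float
    positive_corpus_prob: float
    negative_corpus_prob: float
    neutral_corpus_prob: float


def parse_2011(lines: Iterable[str]) -> Iterator[Entry2011]:
    """Yield one Entry2011 per non-empty, tab-separated line."""
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) != 9:
            raise ValueError(f"line {number}: expected 9 fields, got {len(fields)}")
        # The first three fields are text, the remaining six are numeric.
        numbers = [float(value) for value in fields[3:]]
        yield Entry2011(*fields[:3], *numbers)


# The sample entry from above, joined into one tab-separated line.
sample = ["überzahlte\tüberzahlen\tV\t0\t1\t0\t0\t0.6\t0.4\n"]
entries = list(parse_2011(sample))
print(entries[0].lemma, entries[0].negative_rating, entries[0].negative_corpus_prob)
```

To read a real file, pass an open file object instead of `sample`. The file name and text encoding are not stated here, so check both before opening.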

April, 2012 Format

    Feature         (\t)  Lemma         (\t)  Part-of-Speech (\t)  Polarity (\t)  Probability (\t)
    hoffnungsloser        hoffnungslos        AD                   negative       -/-0.3412/
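
A minimal sketch of reading the April 2012 format in Python and building a polarity lookup. It assumes five tab-separated fields per line and tolerates extra trailing fields, since the header above ends with a tab. The internal layout of the Probability field is not specified here, so the sketch keeps it as raw text rather than guessing. The names `Entry2012`, `parse_2012_line`, and `polarity_by_feature` are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple


class Entry2012(NamedTuple):
    """One entry of the April 2012 format (assumed: at least five fields)."""
    feature: str
    lemma: str
    pos: str
    polarity: str
    probability: str  # kept as raw text, e.g. "-/-0.3412/"


def parse_2012_line(line: str) -> Entry2012:
    """Split one tab-separated line into its first five fields."""
    fields = [field.strip() for field in line.rstrip("\r\n").split("\t")]
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}")
    return Entry2012(*fields[:5])


def polarity_by_feature(lines: Iterable[str]) -> Dict[str, str]:
    """Map each surface form to its polarity label.

    If a feature occurs more than once, the last entry wins.
    """
    lookup: Dict[str, str] = {}
    for line in lines:
        if line.strip():
            entry = parse_2012_line(line)
            lookup[entry.feature] = entry.polarity
    return lookup


# The sample entry from above as one tab-separated line.
sample_2012 = ["hoffnungsloser\thoffnungslos\tAD\tnegative\t-/-0.3412/\n"]
lookup = polarity_by_feature(sample_2012)
print(lookup["hoffnungsloser"])
```

Keying the lookup on the surface form keeps the sketch short. Keying on the (lemma, part-of-speech) pair would be a reasonable alternative if the same form appears with different tags.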