GermanPolarityClues - A Lexical Resource for German Sentiment Analysis

Feel free to use and download the GermanPolarityClues dictionary, a new publicly available lexical resource for sentiment analysis of German. The resource offers 10,141 polarity features, each associated with three numerical polarity scores that indicate the positive, negative, and neutral direction of the term. We showed empirically that GermanPolarityClues is a valuable resource for automatic feature selection, polarity identification, and sentiment analysis for the German language (F1 = 0.88).


* Update - 21.04.2012: updated SentiWS synchronization
* Update - 22.03.2011: updated corpus frequencies

GermanPolarityClues was built by semi-automatically translating existing English-based sentiment resources into German, drawing on three different datasets.

Citation Info

These data sets were first used in: Ulli Waltinger (2010). Sentiment Analysis Reloaded: A Comparative Study On Sentiment Polarity Identification Combining Machine Learning And Subjectivity Features. In Proceedings of the 6th International Conference on Web Information Systems and Technologies (WEBIST '10), April 7-10, 2010, Valencia, Spain.

    author = {Ulli Waltinger},
    title = {GERMANPOLARITYCLUES: A Lexical Resource for German Sentiment Analysis},
    publisher = {electronic proceedings},
    address = {Valletta, Malta},
    booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC)},
    month = {May},

    author = {Ulli Waltinger},
    title = {Sentiment Analysis Reloaded: A Comparative Study On Sentiment Polarity Identification Combining Machine Learning And Subjectivity Features},
    address = {Valencia, Spain},
    booktitle = {Proceedings of the 6th International Conference on Web Information Systems and Technologies (WEBIST '10)},
    month = {April},

Data Format

Version 0.2 (April 2012): the GermanPolarityClues-2012 dataset is synchronized with SentiWS v1.8c (Remus et al., 2010).

Version 0.1 (March 2011): the GermanPolarityClues-2011 dataset is synchronized with SubjectivityClues (Wiebe and Riloff, 2005) and SentiSpin (Takamura et al., 2005).

Each line in each text file corresponds to a single polarity entry. The fields of an entry are separated by a tab character (\t), e.g.:

March, 2011 Format

    Feature     (\t)  Lemma       (\t)  Part-of-Speech  (\t)  PositiveRating  (\t)  NegativeRating  (\t)  NeutralRating  (\t)
    überzahlte        überzahlen        V                     0                     1                     0

    PositiveCorpusProbability  (\t)  NegativeCorpusProbability  (\t)  NeutralCorpusProbability
    0                                0.6                              0.4
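A minimal Python sketch of reading the March 2011 format described above. It assumes that all nine fields sit on one tab-separated line (the header is only wrapped for display) and that all six numeric columns can be read as floats. The names `PolarityEntry2011` and `parse_2011` are hypothetical and not part of the distribution.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class PolarityEntry2011:
    """One line of the March 2011 format (hypothetical container)."""
    feature: str
    lemma: str
    pos: str
    positive_rating: float
    negative_rating: float
    neutral_rating: float
    positive_corpus_prob: float
    negative_corpus_prob: float
    neutral_corpus_prob: float


def parse_2011(lines: Iterable[str]) -> Iterator[PolarityEntry2011]:
    """Yield one entry per non-empty line; assumes nine tab-separated fields."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) < 9:
            raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
        feature, lemma, pos = fields[:3]
        numbers = [float(value) for value in fields[3:9]]
        yield PolarityEntry2011(feature, lemma, pos, *numbers)


# The example entry from the format description, joined into one line.
sample = ["überzahlte\tüberzahlen\tV\t0\t1\t0\t0\t0.6\t0.4\n"]
entries = list(parse_2011(sample))
print(entries[0].lemma, entries[0].negative_corpus_prob)
```

An open file handle can be passed to `parse_2011` in place of the sample list; which encoding to use when opening the files is not stated here and would need to be checked against the download.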

April, 2012 Format

    Feature         (\t)  Lemma         (\t)  Part-of-Speech  (\t)  Polarity  (\t)  Probability  (\t)
    hoffnungsloser        hoffnungslos        AD                    negative        -/-0.3412/
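A minimal Python sketch of reading one line of the April 2012 format. The layout of the Probability field is not documented in this description, so the sketch rests on an assumption: the field is split on `/`, and a slot that is empty or `-` is treated as carrying no value. The meaning of each slot is left open. The names `PolarityEntry2012`, `parse_probability`, and `parse_line_2012` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class PolarityEntry2012(NamedTuple):
    """One line of the April 2012 format (hypothetical container)."""
    feature: str
    lemma: str
    pos: str
    polarity: str
    probability_raw: str                       # field exactly as found in the file
    probability_slots: List[Optional[float]]   # assumed slash-separated slots


def parse_probability(raw: str) -> List[Optional[float]]:
    """Split on '/' and convert each slot; '' and '-' become None (assumption)."""
    slots: List[Optional[float]] = []
    for part in raw.split("/"):
        part = part.strip()
        slots.append(None if part in ("", "-") else float(part))
    return slots


def parse_line_2012(line: str) -> PolarityEntry2012:
    """Parse one tab-separated line; extra trailing fields are ignored."""
    fields = [field.strip() for field in line.rstrip("\r\n").split("\t")]
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}: {line!r}")
    feature, lemma, pos, polarity, raw = fields[:5]
    return PolarityEntry2012(feature, lemma, pos, polarity, raw, parse_probability(raw))


# The example entry from the format description.
entry = parse_line_2012("hoffnungsloser\thoffnungslos\tAD\tnegative\t-/-0.3412/")
print(entry.polarity, entry.probability_slots)
```

Keeping `probability_raw` next to the parsed slots means nothing is lost if the slot assumption turns out not to match the actual files.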


Ulli Waltinger

Bielefeld University
Universitätsstraße 25
33615 Bielefeld


    title = {SentiWS -- a Publicly Available German-language Resource for Sentiment Analysis},
    booktitle = {Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC'10)},
    author = {Remus, R. and Quasthoff, U. and Heyer, G.},
    year = {2010}

    address = {Mexico City, MX},
    author = {Wiebe, Janyce and Riloff, Ellen},
    booktitle = {Proc. of CICLing-05},
    pages = {475--486},
    publisher = {Springer-Verlag},
    title = {Creating Subjective and Objective Sentence Classifiers from Unannotated Texts},
    volume = {3406},
    year = {2005}

    author = {Takamura, Hiroya and Inui, Takashi and Okumura, Manabu},
    title = {Extracting semantic orientations of words using spin model},
    booktitle = {Proc. of ACL '05},
    year = {2005},
    pages = {133--140},
    publisher = {ACL},


We gratefully acknowledge financial support of the German Research Foundation (DFG) through the EC 277 Cognitive Interaction Technology at Bielefeld University.


Creative Commons License
GermanPolarityClues is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.