
GermanPolarityClues – A Lexical Resource for German Sentiment Analysis
The GermanPolarityClues dictionary is a publicly available lexical resource for sentiment analysis of the German language; feel free to download and use it. The resource offers 10,141 polarity features, each associated with three numerical polarity scores that determine the positive, negative and neutral direction of a specific term feature. We showed empirically that GermanPolarityClues is a valuable resource for automatic feature selection, polarity identification and sentiment analysis for the German language (F1 = 0.88).
Introduction
- Version 0.2 (April, 2012) for the GermanPolarityClues dataset (SentiWS Sync)
- Version 0.1 (March, 2011) for the GermanPolarityClues dataset (SubjectivityClues and SentiSpin Sync)
- URL: http://www.ulliwaltinger.de/sentiment/
- URL: http://hudesktop.hucompute.org/
* Update - 21.04.2012: updated SentiWS synchronization
* Update - 22.03.2011: updated corpus frequencies
GermanPolarityClues is the result of a semi-automatic translation of existing English sentiment resources into German, and comprises three different datasets:
- 1. Translation of the Subjectivity Clues dictionary (Wiebe et al., 2005)
- 2. Translation of the SentiSpin dictionary (Takamura et al., 2005)
- 3. GermanPolarityClues dictionary – as an extension of the Subjectivity Clues (Waltinger, 2010)
Data Format
Version 0.2 (April, 2012): the GermanPolarityClues-2012 dataset is synchronized with SentiWS v1.8c (Remus et al., 2010).
Version 0.1 (March, 2011): the GermanPolarityClues-2011 dataset is synchronized with SubjectivityClues (Wiebe et al., 2005) and SentiSpin (Takamura et al., 2005).
- GermanPolarityClues-2012.zip: contains six dictionary files:
- 1. GermanPolarityClues-Negative-21042012.tsv
- 2. GermanPolarityClues-Negative-Lemma-21042012.tsv
- 3. GermanPolarityClues-Positive-21042012.tsv
- 4. GermanPolarityClues-Positive-Lemma-21042012.tsv
- 5. GermanPolarityClues-Neutral-21042012.tsv
- 6. GermanPolarityClues-Neutral-Lemma-21042012.tsv
Each line in each text file corresponds to a single polarity entry. The fields of an entry are separated by a tab character (\t), e.g.:
March, 2011 Format
Feature (\t) Lemma (\t) Part-of-Speech (\t) PositiveRating (\t) NegativeRating (\t) NeutralRating (\t) PositiveCorpusProbability (\t) NegativeCorpusProbability (\t) NeutralCorpusProbability
überzahlte überzahlen V 0 1 0 0 0.6 0.4
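As an illustration of the March, 2011 layout, the following minimal Python sketch splits one entry into its fields. It assumes that every line carries all nine tab-separated fields in the order listed above; the names `Entry2011` and `parse_2011_line` are hypothetical and not part of the resource.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Entry2011:
    """One entry in the March, 2011 format (hypothetical helper type)."""
    feature: str
    lemma: str
    pos: str
    ratings: Tuple[float, float, float]       # positive, negative, neutral
    corpus_probs: Tuple[float, float, float]  # positive, negative, neutral


def parse_2011_line(line: str) -> Entry2011:
    # Assumption: exactly nine tab-separated fields per line.
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    feature, lemma, pos = fields[:3]
    ratings = tuple(float(x) for x in fields[3:6])
    corpus_probs = tuple(float(x) for x in fields[6:9])
    return Entry2011(feature, lemma, pos, ratings, corpus_probs)


# The example entry from the format description above.
sample = "überzahlte\tüberzahlen\tV\t0\t1\t0\t0\t0.6\t0.4"
entry = parse_2011_line(sample)
```

Raising an error on an unexpected field count is a design choice of this sketch; skipping such lines would be an equally reasonable policy.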
April, 2012 Format
Feature (\t) Lemma (\t) Part-of-Speech (\t) Polarity (\t) Probability (\t)
hoffnungsloser hoffnungslos AD negative -/-0.3412/
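For the April, 2012 layout, a lookup table keyed by the surface form is often all that is needed. The sketch below is one possible way to build it with Python's standard `csv` module. The function name `load_polarity_lexicon` is hypothetical, and the Probability field is kept as a raw string because its internal structure is not described here.

```python
import csv
import io


def load_polarity_lexicon(lines):
    """Map each feature to its lemma, part of speech, polarity and raw probability.

    `lines` is any iterable of text lines, e.g. an open file object.
    """
    lexicon = {}
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 5:
            continue  # skip blank or incomplete lines
        feature, lemma, pos, polarity, probability = row[:5]
        lexicon[feature] = {
            "lemma": lemma,
            "pos": pos,
            "polarity": polarity,
            "probability_raw": probability,  # left unparsed on purpose
        }
    return lexicon


# The example entry from the format description above.
sample = io.StringIO("hoffnungsloser\thoffnungslos\tAD\tnegative\t-/-0.3412/\n")
lexicon = load_polarity_lexicon(sample)
print(lexicon["hoffnungsloser"]["polarity"])
```

To read one of the dictionary files instead of the in-memory sample, pass an open file object; the encoding to use (for instance UTF-8) is an assumption that should be checked against the downloaded files.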