Feel free to use and download the GermanPolarityClues dictionary, a new publicly available lexical resource for sentiment analysis of the German language. The resource offers 10,141 polarity features, each associated with three numerical polarity scores that express the positive, negative and neutral direction of the term. We showed empirically that GermanPolarityClues is a valuable resource for automatic feature selection, polarity identification and sentiment analysis for the German language (F1 = 0.88).
* Update - 21.04.2012: updated SentiWS synchronization
* Update - 22.03.2011: updated corpus frequencies
GermanPolarityClues is a semi-automatic translation of existing English sentiment resources into German, by means of three different datasets:
These datasets were first used in: Ulli Waltinger (2010). Sentiment Analysis Reloaded: A Comparative Study On Sentiment Polarity Identification Combining Machine Learning And Subjectivity Features. In Proceedings of the 6th International Conference on Web Information Systems and Technologies (WEBIST '10), April 7-10, 2010, Valencia, Spain.
@inproceedings{Waltinger:2010:a,
  author    = {Ulli Waltinger},
  title     = {GERMANPOLARITYCLUES: A Lexical Resource for German Sentiment Analysis},
  publisher = {electronic proceedings},
  address   = {Valletta, Malta},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC)},
  month     = {May},
  year      = {2010}
}

@inproceedings{Waltinger:2010:b,
  author    = {Ulli Waltinger},
  title     = {Sentiment Analysis Reloaded: A Comparative Study On Sentiment Polarity Identification Combining Machine Learning And Subjectivity Features},
  address   = {Valencia, Spain},
  booktitle = {Proceedings of the 6th International Conference on Web Information Systems and Technologies (WEBIST '10)},
  month     = {April},
  year      = {2010}
}
Version 0.2 (April 2012): the GermanPolarityClues-2012 dataset is synchronized with SentiWS v1.8c (Remus et al., 2010).
Version 0.1 (March 2011): the GermanPolarityClues-2011 dataset is synchronized with SubjectivityClues (Wiebe et al., 2005) and SentiSpin (Takamura et al., 2005).
Each line in each text file corresponds to a single polarity entry. The fields of an entry are separated by a tab character (\t), e.g.:
Feature (\t) Lemma (\t) Part-of-Speech (\t) PositiveRating (\t) NegativeRating (\t) NeutralRating (\t)
überzahlte   überzahlen   V   0   1   0

PositiveCorpusProbability (\t) NegativeCorpusProbability (\t) NeutralCorpusProbability
0   0.6   0.4
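As an illustration, an entry in the rating layout above could be read as follows. This is only a minimal sketch: it assumes that the three ratings and the three corpus probabilities are stored together on one line as nine tab-separated fields, which the example above does not state explicitly. The names `PolarityEntry` and `parse_entry` are hypothetical and not part of the resource.

```python
from dataclasses import dataclass


@dataclass
class PolarityEntry:
    """One dictionary entry (hypothetical container, not shipped with the data)."""
    feature: str
    lemma: str
    pos: str
    positive_rating: float
    negative_rating: float
    neutral_rating: float
    positive_corpus_prob: float
    negative_corpus_prob: float
    neutral_corpus_prob: float


def parse_entry(line: str) -> PolarityEntry:
    """Split one tab-separated line into its fields.

    Assumption: nine fields per line, in the order of the header above.
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    feature, lemma, pos = fields[:3]
    numbers = [float(value) for value in fields[3:]]
    return PolarityEntry(feature, lemma, pos, *numbers)


# The example entry from the text, joined into a single line (assumption).
example = "überzahlte\tüberzahlen\tV\t0\t1\t0\t0\t0.6\t0.4"
entry = parse_entry(example)
print(entry.lemma, entry.negative_rating, entry.negative_corpus_prob)
```

If the files turn out to use a different column count, the length check raises an error instead of silently misreading the scores.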
Feature (\t) Lemma (\t) Part-of-Speech (\t) Polarity (\t) Probability (\t)
hoffnungsloser   hoffnungslos   AD   negative   -/-0.3412/
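For the layout with a single polarity label, a lookup table keyed by the surface form may be all that is needed for simple polarity identification. The sketch below is written under assumptions: `ClueEntry` and `load_polarity_lookup` are hypothetical names, a trailing tab at the end of a line is tolerated, and the Probability field is kept verbatim because its internal structure (e.g. `-/-0.3412/`) is not documented here.

```python
from typing import Dict, Iterable, NamedTuple


class ClueEntry(NamedTuple):
    """One entry of the polarity-label layout (hypothetical container)."""
    feature: str
    lemma: str
    pos: str
    polarity: str
    probability: str  # kept as raw text; its format is not interpreted here


def load_polarity_lookup(lines: Iterable[str]) -> Dict[str, ClueEntry]:
    """Build a feature -> entry lookup from tab-separated lines."""
    lookup: Dict[str, ClueEntry] = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 5:
            raise ValueError(f"expected at least 5 fields, got {len(fields)}")
        # Only the first five fields are used, so a trailing tab is harmless.
        entry = ClueEntry(*fields[:5])
        lookup[entry.feature] = entry
    return lookup


# The example entry from the text; a real file would be opened with
# open(path, encoding="utf-8") and passed in the same way.
sample_lines = ["hoffnungsloser\thoffnungslos\tAD\tnegative\t-/-0.3412/"]
lookup = load_polarity_lookup(sample_lines)
print(lookup["hoffnungsloser"].polarity)
```

Keying by the inflected feature rather than the lemma avoids the need for a lemmatizer at lookup time. Whether that suits a given application is a design choice and not something the resource prescribes.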
Ulli Waltinger
ulli_marc.waltinger@uni-bielefeld.de
www.ulliwaltinger.de
University Bielefeld
Texttechnology
Universitätsstraße 25
33615 Bielefeld
Germany
@inproceedings{remquahey2010,
  title     = {SentiWS -- a Publicly Available German-language Resource for Sentiment Analysis},
  booktitle = {Proceedings of the 7th International Language Resources and Evaluation (LREC'10)},
  author    = {Remus, R. and Quasthoff, U. and Heyer, G.},
  year      = {2010}
}

@inproceedings{WilWiebe:2005,
  address   = {Mexico City, MX},
  author    = {Wiebe, Janyce and Riloff, Ellen},
  booktitle = {Proc. of CICLing-05},
  pages     = {475--486},
  publisher = {Springer-Verlag},
  title     = {Creating Subjective and Objective Sentence Classifiers from Unannotated Texts},
  volume    = {3406},
  year      = {2005}
}

@inproceedings{Takamura:2005,
  author    = {Takamura, Hiroya and Inui, Takashi and Okumura, Manabu},
  title     = {Extracting semantic orientations of words using spin model},
  booktitle = {Proc. of ACL '05},
  year      = {2005},
  pages     = {133--140},
  publisher = {ACL}
}
We gratefully acknowledge financial support of the German Research Foundation (DFG) through the EC 277 Cognitive Interaction Technology at Bielefeld University.
License
GermanPolarityClues is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.