Software

Wikipedia Question Answering (WikiQA)

WikiQA is a German open-domain question answering (QA) system that uses Wikipedia as a knowledge base to answer natural language questions. It was developed by the KnowCIT project (Artificial Intelligence Group) within CITEC at Bielefeld University. Open-domain encyclopedic knowledge bases such as Wikipedia have recently attracted the attention of QA researchers. However, most proposed Wikipedia-based QA systems focus on Wikipedia's document collection for answer retrieval and thus disregard the complex hierarchical representation of knowledge in its category taxonomy, which can also be valuable for QA. WikiQA approaches the Wikipedia collection from a different point of view: it uses the category taxonomy as a reference point for identifying the broader topic of a user's question and deduces a set of expected answer candidates from that topic. More precisely, it accesses and activates only those areas of the knowledge base that are primarily relevant to the topic of the question.
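The idea of narrowing the search space through the category taxonomy can be illustrated with a minimal Python sketch. It is not the WikiQA implementation: the toy taxonomy, the article lists, the keyword cues and the function names are all hypothetical, and English examples are used only for readability. The sketch maps a question to a category and then collects the articles filed under that category and its subcategories as answer candidates.

```python
from collections import deque

# Hypothetical toy taxonomy: category -> direct subcategories.
SUBCATEGORIES = {
    "Physicists": ["German physicists"],
    "German physicists": [],
    "Rivers": [],
}

# Hypothetical article assignments: category -> article titles.
ARTICLES = {
    "Physicists": ["Isaac Newton"],
    "German physicists": ["Albert Einstein", "Max Planck"],
    "Rivers": ["Rhine"],
}

# Hypothetical cue words that point to a category (an assumption of this
# sketch; a real system would need a far richer topic identification step).
TOPIC_CUES = {"physicist": "Physicists", "river": "Rivers"}


def identify_topic(question):
    """Return the first category whose cue word occurs in the question."""
    for token in question.lower().replace("?", " ").split():
        if token in TOPIC_CUES:
            return TOPIC_CUES[token]
    return None


def answer_candidates(question):
    """Collect articles under the question's topic category, breadth-first."""
    topic = identify_topic(question)
    if topic is None:
        return []
    candidates = []
    queue = deque([topic])
    seen = {topic}
    while queue:
        category = queue.popleft()
        candidates.extend(ARTICLES.get(category, []))
        for sub in SUBCATEGORIES.get(category, []):
            if sub not in seen:  # guard against cycles in the taxonomy
                seen.add(sub)
                queue.append(sub)
    return candidates


if __name__ == "__main__":
    print(answer_candidates("Which physicist developed the theory of relativity?"))
```

Only the articles below the identified category are considered, so the river article never becomes a candidate for a question about physicists.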

 

Wikipedia OpenTopicModel

Open Topic Model (OTM) is a text analysis tool written in C++ (Qt Development Frameworks) and ported to Java and Apache Lucene. It addresses the problem of topic identification by means of open topic models: rather than clustering a document collection, it labels individual documents with the best-fitting topic names obtained from a social ontology. The current OTM implementation uses over 55,000 different Wikipedia categories as topic labels and combines keyword extraction, as a form of text representation, with categorization by means of topic labelling.
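The combination of keyword extraction and topic labelling might be pictured with the following Python sketch. It is an illustration under stated assumptions, not the OTM algorithm: keywords are taken to be the most frequent non-stopword terms, and each category label is represented by a small hand-made keyword profile. The stopword list, the two category profiles and the function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list and category profiles (assumptions of this sketch).
STOPWORDS = {"the", "a", "of", "and", "in", "is", "to", "was", "by"}

CATEGORY_PROFILES = {
    "Football": {"goal", "match", "league", "striker"},
    "Astronomy": {"planet", "orbit", "telescope", "star"},
}


def extract_keywords(text, top_n=10):
    """Represent a text by its most frequent non-stopword terms."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    return Counter(tokens).most_common(top_n)


def label_document(text, top_n=10):
    """Return the category whose profile best matches the extracted keywords."""
    keywords = extract_keywords(text, top_n)
    scores = {
        category: sum(freq for term, freq in keywords if term in profile)
        for category, profile in CATEGORY_PROFILES.items()
    }
    best = max(scores, key=scores.get)
    # Return no label at all when nothing in the text matches any profile.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(label_document("The telescope tracked the planet along its orbit."))
```

Each document receives its own label; no clustering of the collection takes place.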

 

German Polarity Clues

A lexical resource for German sentiment analysis. GermanPolarityClues is a new, publicly available lexical resource for sentiment analysis of German; feel free to download and use the dictionary. The resource offers 10,141 polarity features, each associated with three numerical polarity scores that determine the positive, negative and neutral direction of the term feature.
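A lexicon of this shape, in which each term carries a positive, a negative and a neutral score, could be used roughly as in the Python sketch below. The three entries and their scores are invented for illustration and are not taken from GermanPolarityClues, and the in-memory dictionary is an assumption of this sketch rather than the format of the distributed resource.

```python
# Hypothetical entries: term -> (positive, negative, neutral) scores.
LEXICON = {
    "gut": (0.8, 0.1, 0.1),
    "schlecht": (0.1, 0.8, 0.1),
    "haus": (0.1, 0.1, 0.8),
}

LABELS = ("positive", "negative", "neutral")


def score_text(text):
    """Sum the three polarity scores over all tokens found in the lexicon."""
    totals = [0.0, 0.0, 0.0]
    for token in text.lower().split():
        scores = LEXICON.get(token.strip(".,!?"))
        if scores is not None:
            for i, value in enumerate(scores):
                totals[i] += value
    return tuple(totals)


def classify(text):
    """Return the label with the highest summed score, or neutral if no term matched."""
    totals = score_text(text)
    if sum(totals) == 0:
        return "neutral"
    return LABELS[totals.index(max(totals))]


if __name__ == "__main__":
    print(classify("Das Essen war gut!"))
```

Summing scores per token is the simplest possible aggregation; negation and intensifiers are ignored here.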

 

eHumanities Desktop

I was one of the core developers (ExtJS/Java) of the eHumanities Desktop, a web-based desktop system. It lets you explore, process and analyse resources within an intuitive desktop environment similar to the Windows desktop. You can upload, organize and share different resources and media, as well as use computational linguistics applications such as classification, tagging or topic labelling (see hucompute.org for more information).