Informationswissenschaft und Sprachtechnologie
Feelings influence human behavior, for example by motivating particular actions, evaluating past experiences, and shaping social interaction. As subjective experiences, feelings also play an important role in searching the internet, which is why they are studied in the field of Information Seeking Behavior. This thesis is situated in the discipline of information science and aims to extend knowledge about searchers' feelings and to draw constructive conclusions from it. It investigates how information seeking on the internet is experienced emotionally and which conditions and causes searchers regard as significant for their emotional experience during online search. To explore this, it uses a methodological framework that approaches the topic in a very different way from previous research in this area: Grounded Theory methodology. Its principles of asking questions and making comparisons yield a theory that is both interpretive and empirically grounded. The data underlying this theory are guided interviews in which young adults from the USA and Germany describe their impressions and feelings during internet search. The participants refer to an internet search conducted immediately before the interview, in which they were guided by an information need of their own. The study shows, first, how strongly the individual search topics influence the searchers' feelings. Second, it finds that feelings relating to the execution of the search are surprisingly weak, because internet search is perceived as a normal routine activity.
Based on these findings on the individuality and everyday nature of the search experience, this thesis formulates suggestions for better supporting searchers and for future research on the affective dimension of online search.
We present a first attempt at classifying German tweets by region using only the text of the tweets. German Twitter users are largely unwilling to share geolocation data. Here, we introduce a two-step process. First, we identify regionally salient tweets by comparing them to an "average" German tweet based on lexical features. Then, regionally salient tweets are assigned to one of 7 dialectal regions. We achieve an accuracy (on regional tweets) of up to 50% on a balanced corpus, much improved from the baseline. Finally, we show several directions in which this work can be extended and improved.
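The two-step idea described above can be illustrated with a minimal sketch. It is not the authors' system: the toy tweets, the two region labels, the smoothed log-ratio score, and the threshold are all assumptions chosen for illustration. Step one scores a tweet against a pooled "average" word distribution, and step two assigns a region only if the best score is salient enough.

```python
import math
from collections import Counter

# Hypothetical toy training data: region label -> tweets from that region.
TRAIN = {
    "north": ["moin moin alle zusammen", "moin wat is los"],
    "south": ["servus wie geht es", "servus mia san da"],
}

def tokenize(text):
    return text.lower().split()

def build_models(train):
    """Return per-region word counts and the pooled ("average") word counts."""
    regional = {
        region: Counter(tok for tweet in tweets for tok in tokenize(tweet))
        for region, tweets in train.items()
    }
    background = Counter()
    for counts in regional.values():
        background.update(counts)
    return regional, background

def log_ratio(word, counts, background):
    """Add-one smoothed log ratio of regional vs. pooled relative frequency."""
    vocab = len(background)
    p_region = (counts[word] + 1) / (sum(counts.values()) + vocab)
    p_background = (background[word] + 1) / (sum(background.values()) + vocab)
    return math.log(p_region / p_background)

def classify(tweet, regional, background, threshold=0.5):
    """Return the best region, or None if the tweet is not regionally salient."""
    # Words never seen in training carry no evidence and are skipped.
    known = [w for w in tokenize(tweet) if w in background]
    scores = {
        region: sum(log_ratio(w, counts, background) for w in known)
        for region, counts in regional.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None

REGIONAL, BACKGROUND = build_models(TRAIN)
```

Calling `classify("moin zusammen", REGIONAL, BACKGROUND)` yields a region label, while a tweet without any regionally marked vocabulary returns `None` and is treated as non-regional.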
In this paper, we describe the system we developed for the GErman SenTiment AnaLysis shared Task (GESTALT), with which we participated in Main Task 2: Subjective Phrase and Aspect Extraction from Product Reviews. We present a tool that identifies subjective and aspect phrases in German product reviews. For the recognition of subjective phrases, we pursue a lexicon-based approach. For the extraction of aspect phrases from the reviews, we consider two possible ways: besides the subjectivity and aspect look-up, we also implemented a method to establish which subjective phrase belongs to which aspect. The system achieves better results for the recognition of aspect phrases than for the identification of subjective phrases.
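As a rough illustration of a lexicon look-up followed by a linking step, the sketch below marks subjective and aspect tokens with two small word lists and then pairs each subjective token with the closest aspect token. The lexicon entries and the nearest-token heuristic are assumptions made for this example; the abstract does not say how the actual system links phrases.

```python
# Hypothetical mini-lexicons; a real system would use much larger resources.
SUBJECTIVE_LEXICON = {"gut", "schlecht", "toll", "langsam"}
ASPECT_LEXICON = {"akku", "display", "kamera"}

def look_up(tokens):
    """Return the token positions of subjective words and of aspect words."""
    subjective = [i for i, t in enumerate(tokens) if t.lower() in SUBJECTIVE_LEXICON]
    aspects = [i for i, t in enumerate(tokens) if t.lower() in ASPECT_LEXICON]
    return subjective, aspects

def link(tokens):
    """Pair each subjective word with the nearest aspect word (assumed heuristic)."""
    subjective, aspects = look_up(tokens)
    pairs = []
    for s in subjective:
        if aspects:
            nearest = min(aspects, key=lambda a: abs(a - s))
            pairs.append((tokens[s], tokens[nearest]))
    return pairs

review = "Der Akku ist gut aber das Display ist schlecht".split()
pairs = link(review)
```

A distance-based heuristic like this is easy to inspect but ignores syntax, so it will mislink phrases in longer sentences.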
We report on the two systems we built for Task 1 of the German Sentiment Analysis Shared Task, the task on Source, Subjective Expression and Target Extraction from Political Speeches (STEPS). The first system is a rule-based system relying on a predicate lexicon specifying extraction rules for verbs, nouns and adjectives, while the second is a translation-based system that has been obtained with the help of the (English) MPQA corpus.
We present the German Sentiment Analysis Shared Task (GESTALT) which consists of two main tasks: Source, Subjective Expression and Target Extraction from Political Speeches (STEPS) and Subjective Phrase and Aspect Extraction from Product Reviews (StAR). Both tasks focused on fine-grained sentiment analysis, extracting aspects and targets with their associated subjective expressions in the German language. STEPS focused on political discussions from a corpus of speeches in the Swiss parliament. StAR fostered the analysis of product reviews as they are available from the website Amazon.de. Each shared task led to one participating submission, providing baselines for future editions of this task and highlighting specific challenges. The shared task homepage can be found at https://sites.google.com/site/iggsasharedtask/.
In this paper, we present our Named Entity Recognition (NER) system for German – NERU (Named Entity Rules), which heavily relies on handcrafted rules as well as information gained from a cascade of existing external NER tools. The system combines large gazetteer lists, information obtained by comparison of different automatic translations and POS taggers. With NERU, we were able to achieve a score of 73.26% on the development set provided by the GermEval 2014 Named Entity Recognition Shared Task for German.
This paper presents a Named Entity Recognition system for German based on Conditional Random Fields. The model also includes language-independent features and features computed from large-coverage lexical resources. Alongside the results themselves, we show that adding linguistic resources to a probabilistic model improves the results significantly.
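To make the distinction between language-independent features and lexicon-derived features concrete, here is a minimal sketch of a per-token feature function of the kind commonly fed to a CRF. The feature names and the tiny gazetteer are hypothetical and are not taken from the paper.

```python
# Hypothetical gazetteer standing in for a large-coverage lexical resource.
GAZETTEER = {"berlin": "LOC", "siemens": "ORG"}

def token_features(tokens, i):
    """Build a feature dictionary for the token at position i."""
    word = tokens[i]
    return {
        # Language-independent surface features.
        "lower": word.lower(),
        "is_capitalized": word[:1].isupper(),
        "is_all_caps": word.isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        "suffix3": word[-3:].lower(),
        # Context features from the neighbouring tokens.
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
        # Lexicon-derived feature.
        "gazetteer": GAZETTEER.get(word.lower(), "NONE"),
    }

sentence = ["Siemens", "sitzt", "in", "Berlin"]
features = [token_features(sentence, i) for i in range(len(sentence))]
```

Dropping the `gazetteer` entry from the dictionary gives a purely surface-based baseline, which is one simple way to measure what the lexical resource contributes.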
In recent decades, machine learning approaches have been studied intensively for natural language processing. Most of the time, such systems rely on statistics, analyzing texts at the token level and, for labelling tasks, assigning each token to one of the possible classes. One may notice that earlier symbolic approaches (e.g. transducers) were designed to delimit pieces of text. Our research team developed mXS, a system that aims to combine both approaches. It locates the boundaries of entities using sequential pattern mining and machine learning. This system, initially developed for French, has been adapted to German.
In this paper, we investigate a semi-supervised learning approach based on neural networks for nested named entity recognition on the GermEval 2014 dataset. The dataset consists of triples of a word, a named entity associated with that word in the first-level and one in the second-level. Additionally, the tag distribution is highly skewed, that is, the number of occurrences of certain types of tags is too small. Hence, we present a unified neural network architecture to deal with named entities in both levels simultaneously and to improve generalization performance on the classes that have a small number of labelled examples.
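The triple structure and the skewed tag distribution mentioned above can be illustrated with a short sketch. It assumes a simplified three-column, tab-separated layout (word, first-level tag, second-level tag) and an invented sample; the actual GermEval 2014 file format may differ.

```python
from collections import Counter

# Invented sample in an assumed three-column layout: word, level-1 tag, level-2 tag.
SAMPLE = """\
Universität\tB-ORG\tO
Regensburg\tI-ORG\tB-LOC
liegt\tO\tO
in\tO\tO
Bayern\tB-LOC\tO
"""

def parse_triples(text):
    """Parse tab-separated lines into (word, first_level, second_level) triples."""
    triples = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between sentences
        word, first, second = line.split("\t")
        triples.append((word, first, second))
    return triples

def tag_distribution(triples, level):
    """Count tags at level 1 (outer) or level 2 (nested)."""
    return Counter(triple[level] for triple in triples)

triples = parse_triples(SAMPLE)
outer = tag_distribution(triples, 1)
nested = tag_distribution(triples, 2)
```

Even in this tiny sample the nested level is dominated by the `O` tag, which mirrors the kind of skew the abstract describes.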
In this paper we present Nessy (Named Entity Searching System) and its application to German in the context of the GermEval 2014 Named Entity Recognition Shared Task (Benikova et al., 2014a). We tackle the challenge by using a combination of machine learning (Naive Bayes classification) and rule-based methods. Altogether, Nessy achieves an F-score of 58.78% on the final test set.
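For readers unfamiliar with the F-score reported here, the following sketch computes a plain span-level F1 from sets of gold and predicted entities. It is a simplified illustration, not the official GermEval 2014 metric, and the span tuples `(start, end, type)` are an assumed representation.

```python
def f1_score(gold, predicted):
    """Harmonic mean of precision and recall over exactly matching entity spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical spans: (start token, end token, entity type).
gold_spans = {(0, 2, "PER"), (5, 6, "LOC"), (8, 9, "ORG")}
predicted_spans = {(0, 2, "PER"), (5, 6, "ORG")}
score = f1_score(gold_spans, predicted_spans)
```

In this example one of two predictions is correct (precision 0.5) and one of three gold entities is found (recall 1/3), so a span with the right boundaries but the wrong type counts as an error on both sides.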