Improving the Performance of Standard Part-of-Speech Taggers for Computer-Mediated Communication

  • We assess the performance of off-the-shelf POS taggers when applied to two types of Internet texts in German, and investigate easy-to-implement methods to improve tagger performance. Our main findings are that extending a standard training set with small amounts of manually annotated data for Internet texts leads to a substantial improvement of tagger performance, which can be further improved by using a previously proposed method to automatically acquire training data. As a prerequisite for the evaluation, we create a manually annotated corpus of Internet forum and chat texts.

Download full text files

  • Main Conference Proceedings of the 12th Konvens 2014

Author: Andrea Horbach, Diana Steffen, Stefan Thater, Manfred Pinkal
Parent Title (English): Proceedings of the 12th edition of the KONVENS conference
Document Type: Conference Proceeding
Date of Publication (online): 2014/10/23
Release Date: 2014/10/23
Tag: Part-of-speech annotation; morphology; phonetics; phonology; segmentation; tagging
GND Keyword: Morphology; Phonetics; Phonology; Segmentation
First Page: 171
Last Page: 177
PPN: Link to catalogue
Institutes: Fachbereich III / Informationswissenschaft und Sprachtechnologie
DDC classes: 400 Language / 400 Language, Linguistics
Collections: KONVENS 2014 / Proceedings of the 12th KONVENS 2014
Licence (German): Creative Commons - Attribution 3.0