Tagging Complex Non-Verbal German Chunks with Conditional Random Fields
We report on chunk tagging methods for German that recognize complex non-verbal phrases using structural chunk tags with Conditional Random Fields (CRFs). This state-of-the-art method for sequence classification achieves 93.5% accuracy on newspaper text. For the same task, a classical trigram tagger approach based on Hidden Markov Models reaches a baseline of 88.1%. CRFs allow for a clean and principled integration of linguistic knowledge such as part-of-speech tags, morphological constraints and lemmas. The structural chunk tags encode phrase structures up to a depth of 3 syntactic nodes. They include complex prenominal and postnominal modifiers that occur frequently in German noun phrases.
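To illustrate how part-of-speech tags, morphological information and lemmas can enter a CRF as features, the following is a minimal sketch and not the authors' implementation. The function names, feature names, context window and the example sentence with its annotations are all assumptions made for illustration; the abstract does not specify the feature templates or the chunk tag inventory.

```python
from typing import Dict, List, Tuple

# One token = (word form, POS tag, lemma, morphology).
# The annotation values below are invented for illustration only.
Token = Tuple[str, str, str, str]


def token_features(sent: List[Token], i: int) -> Dict[str, str]:
    """Build a feature dict for token i from its own and its neighbours' annotations."""
    word, pos, lemma, morph = sent[i]
    feats = {
        "word.lower": word.lower(),
        "pos": pos,
        "lemma": lemma,
        "morph": morph,
    }
    if i > 0:
        # Left context: POS and morphology of the previous token.
        feats["-1:pos"] = sent[i - 1][1]
        feats["-1:morph"] = sent[i - 1][3]
    else:
        feats["BOS"] = "1"  # beginning of sentence
    if i < len(sent) - 1:
        # Right context: POS of the next token.
        feats["+1:pos"] = sent[i + 1][1]
    else:
        feats["EOS"] = "1"  # end of sentence
    return feats


def sentence_features(sent: List[Token]) -> List[Dict[str, str]]:
    """Feature dicts for every token of one sentence, in order."""
    return [token_features(sent, i) for i in range(len(sent))]


# Hypothetical noun phrase with a prenominal modifier ("the very old man").
example: List[Token] = [
    ("der", "ART", "der", "Nom.Sg.Masc"),
    ("sehr", "ADV", "sehr", "_"),
    ("alte", "ADJA", "alt", "Nom.Sg.Masc"),
    ("Mann", "NN", "Mann", "Nom.Sg.Masc"),
]
features = sentence_features(example)
```

A sequence of feature dicts like this, paired with one chunk tag per token, is the kind of input that common CRF toolkits accept for training. Unlike a trigram HMM tagger, the model can condition on any of these overlapping features at once.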
Author: | Luzia Roth, Simon Clematide |
---|---|
URN: | https://nbn-resolving.org/urn:nbn:de:gbv:hil2-opus-2673 |
Parent Title (English): | Proceedings of the 12th edition of the KONVENS conference |
Document Type: | Conference Proceeding |
Language: | English |
Date of Publication (online): | 2014/10/22 |
Release Date: | 2014/10/22 |
Tag: | Chunking; Grammar; Parser; Syntax |
GND Keyword: | Chunking; Grammatik; Parser; Syntax |
First Page: | 48 |
Last Page: | 57 |
PPN: | Link to catalog |
Institutes: | Fachbereich III / Informationswissenschaft und Sprachtechnologie |
DDC classes: | 400 Language / 400 Language, Linguistics |
Collections: | KONVENS 2014 / Proceedings of the 12th KONVENS 2014 |