MoSTNER: Morphology-aware Split-Tag German NER with Factorie

  • MoSTNER is a German NER system based on machine learning with log-linear models and morphology-aware features. We use morphological analysis with Morphisto to generate features. Moreover, we use German Wikipedia as a gazetteer and perform punctuation-aware and morphology-aware page title matching. We use four types of factor graphs, in which NER labels are either single variables or split into prefix (BILOU) and type (PER, LOC, etc.) variables. Our system supports nested NER (two levels). For training we use SampleRank, and for prediction we use Iterated Conditional Modes. The implementation is based on Python and Factorie.
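The split-tag idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common BILOU convention (Begin, Inside, Last, Outside, Unit) and string labels of the form `B-PER`. The function names `bilou_encode`, `split_tag`, and `join_tag` are hypothetical and are not taken from the MoSTNER code base; how the system actually represents the two variables in its factor graphs is not described here.

```python
from typing import List, Optional, Tuple

def bilou_encode(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Turn entity spans (start, end_exclusive, type) into joint BILOU labels."""
    labels = ["O"] * n_tokens
    for start, end, etype in spans:
        if end - start == 1:
            labels[start] = "U-" + etype          # single-token entity
        else:
            labels[start] = "B-" + etype          # first token
            for i in range(start + 1, end - 1):
                labels[i] = "I-" + etype          # inner tokens
            labels[end - 1] = "L-" + etype        # last token
    return labels

def split_tag(label: str) -> Tuple[str, Optional[str]]:
    """Split a joint label into a (prefix, type) pair; 'O' has no type."""
    if label == "O":
        return ("O", None)
    prefix, etype = label.split("-", 1)
    return (prefix, etype)

def join_tag(prefix: str, etype: Optional[str]) -> str:
    """Inverse of split_tag: rebuild the joint label from its two parts."""
    return "O" if prefix == "O" else f"{prefix}-{etype}"

# Illustrative sentence (not taken from the paper's data).
tokens = ["Angela", "Merkel", "besuchte", "Berlin"]
spans = [(0, 2, "PER"), (3, 4, "LOC")]

joint = bilou_encode(len(tokens), spans)
split = [split_tag(label) for label in joint]
```

With split labels, a model can treat the position of a token within an entity and the entity type as two separate variables, whereas the joint encoding uses one variable per token.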

Metadata
Author: Peter Schüller
URN: https://nbn-resolving.org/urn:nbn:de:gbv:hil2-opus-3030
ISBN: 978-3-934105-47-8
Document Type: Conference Proceeding
Language: English
Date of Publication (online): 2014/11/25
Release Date: 2014/11/25
Tag: NER; Named entity recognition
GND Keyword: Computational linguistics
Source: Workshop Proceedings of the 12th KONVENS 2014
PPN: Link to catalogue
Institutes: Department III / Information Science and Language Technology
DDC classes: 400 Language / 400 Language, Linguistics
Collections: KONVENS 2014 / Workshop Proceedings of the 12th KONVENS 2014
Access Rights: Freely accessible
Licence (German): Creative Commons - Attribution
