Corpus-Based Linguistic Typology: A Comprehensive Approach

  • This paper takes a holistic view of corpus-based linguistic typology and presents an overview of current advances at Leipzig University. Our goal is to use automatically created text data for a large variety of languages in quantitative typological investigations. We use text corpora created for several hundred languages in cross-language quantitative studies based on mathematically well-founded methods (Cysouw, 2005). These analyses include measuring textual characteristics of the corpora; the basic requirements for using these measurements as parameters are also discussed. The measured values then serve as input for typological studies: quantitative methods detect correlations among the measured corpus properties and between them and classical typological parameters. Our work can be regarded as an automatic, language-independent process chain, which allows extensive investigations of the various languages of the world.
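
As a rough illustration of the kind of process chain the abstract describes (measure a textual characteristic per corpus, then correlate it with a typological parameter), the following is a minimal Python sketch. It is not the authors' implementation. The function names, the whitespace tokenizer, the choice of mean word length and type-token ratio as measurements, the use of Spearman rank correlation, and all sample strings and parameter values are assumptions made purely for illustration.

```python
from statistics import mean


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


def mean_word_length(text):
    """Average token length in characters."""
    return mean(len(tok) for tok in tokenize(text))


def type_token_ratio(text):
    """Number of distinct tokens divided by the total number of tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens)


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation, computed as the Pearson correlation of ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Placeholder samples, NOT real corpus data.
toy_corpora = {
    "lang_a": "a bb a bb cc",
    "lang_b": "aaaa bbbb cccc aaaa",
    "lang_c": "aaaaaa bbbbbb cccccc dddddd",
}
# Hypothetical values of some classical typological parameter.
toy_parameter = {"lang_a": 1.0, "lang_b": 2.0, "lang_c": 3.0}

langs = sorted(toy_corpora)
lengths = [mean_word_length(toy_corpora[lang]) for lang in langs]
rho = spearman(lengths, [toy_parameter[lang] for lang in langs])
```

In a real study the toy strings would be replaced by full corpora and the parameter values by entries from a typological database. Measurements such as the type-token ratio depend on corpus size, which is one reason the requirements for using such parameters need to be discussed before values are compared across languages.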

Download full text files

  • Main Conference Proceedings of the 12th Konvens 2014

Author: Dirk Goldhahn, Uwe Quasthoff, Gerhard Heyer
Parent Title (English): Proceedings of the 12th edition of the KONVENS conference
Document Type: Conference Proceeding
Date of Publication (online): 2014/10/23
Release Date: 2014/10/23
Tag: machine translation; multilingual systems
GND Keyword: Machine translation
First Page: 215
Last Page: 221
Contributor: Faaß, Gertrud
Institutes: Department III (Fachbereich III) / Information Science and Language Technology
DDC classes: 400 Language / 400 Language, Linguistics
Collections: KONVENS 2014 / Proceedings of the 12th KONVENS 2014
Licence (German): Creative Commons - Attribution 3.0