In distributional semantics, unsupervised learning has been widely used for a large number of tasks. Supervised learning, by contrast, has received far less attention.
In this dissertation, we investigate the supervised learning approach for semantic relatedness tasks in distributional semantics. The investigation focuses mainly on semantic similarity and semantic classification tasks. Both existing and newly constructed datasets serve as input for the experiments. The new datasets are constructed from thesauri such as Eurovoc, a multilingual thesaurus maintained by the Publications Office of the European Union. The meaning of the words in the datasets is represented using a distributional semantic approach.
The distributional semantic approach collects co-occurrence information from large text corpora and represents each word as a high-dimensional vector. English words are represented using the ukWaC corpus, and German words using the deWaC corpus. Different supervised machine learning methods are then applied to these vectors on the selected tasks. Their outputs are evaluated by comparing task performance and accuracy against the results of state-of-the-art unsupervised methods. In addition, multi-relational matrix factorization is introduced as a supervised learning method in distributional semantics. The dissertation shows that multi-relational matrix factorization is a good alternative for integrating different sources of information about words in distributional semantics.
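To make the representation step concrete, the following is a minimal sketch of how co-occurrence counts can be turned into word vectors and compared. It is an illustration only, not the dissertation's pipeline: the toy corpus, the window size of 2, the use of raw counts without any weighting, and the helper names `cooccurrence_vectors` and `cosine` are all assumptions made for this example.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a sparse vector (Counter) of context-word counts.

    A context word is any other token within `window` positions of the
    target word in the same sentence. Window size is an assumption here.
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[ctx] for ctx, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus standing in for a large corpus such as ukWaC.
toy_corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the car needs fuel".split(),
]

vectors = cooccurrence_vectors(toy_corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
print(sim_cat_dog, sim_cat_car)
```

In this toy setting "cat" and "dog" share more contexts than "cat" and "car", so their cosine similarity is higher. In a supervised setup, vectors of this kind (or features derived from pairs of them) would be the input to a classifier or regressor trained on labeled word pairs.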
The dissertation also introduces several new applications. One analyzes the text of a German company's website and presents information about the company as a concept cloud visualization. The others are the automatic recognition and disambiguation of Library of Congress Subject Headings and the automatic identification of synonym relations in the Dutch Parliament thesaurus.
Changing conditions, summed up by the keywords digitalization, diversity, and skills shortage, continually confront the university with new challenges. These can only be met together with its employees. Personnel development is becoming increasingly important in supporting these processes.
With this concept, the university has laid the foundation for defining measures in the relevant fields of action and, in the spirit of reflexive, learning-oriented practice, for repeatedly questioning and further developing them, in the interest of both the university and its employees.