In patent retrieval, ranking methods and techniques such as relevance feedback have not yet become established. The main criticism leveled at ranking systems is their lack of transparency for the user. Building on an analysis of search processes in patent retrieval, the PatentAide system attempts to implement a ranking system. PatentAide supports important techniques in the patent retrieval process, such as term expansion, provides a ranked result list, and additionally allows dynamic relevance feedback.
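Relevance feedback of the kind mentioned above is most often illustrated with the classic Rocchio update; the sketch below is a generic textbook version, not PatentAide's actual ranking model, and all vectors and weights are invented for illustration:

```python
import numpy as np

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio update: move the query vector toward the centroid
    of relevant documents and away from non-relevant ones."""
    q = alpha * np.asarray(query, dtype=float)
    if len(relevant):
        q += beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q -= gamma * np.mean(nonrelevant, axis=0)
    return np.clip(q, 0.0, None)   # negative term weights are dropped

# One relevant and one non-relevant document over a three-term vocabulary.
q = rocchio([1, 0, 0], relevant=[[0, 1, 0]], nonrelevant=[[0, 0, 1]])
print(q)
```

The updated query now weights the term shared with the relevant document, which is what makes a subsequent re-ranking "dynamic" as the user marks more results.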
In the context of genome research, the method of gene expression analysis has been used for several years. Related microarray experiments are conducted all over the world, and consequently, vast numbers of microarray data sets are produced. Having access to this variety of repositories, researchers would like to incorporate this data in their analyses to increase the statistical significance of their results. In this paper, we present a new two-phase clustering strategy based on combining local clustering results to obtain a global clustering. The advantage of such a technique is that each microarray data set can be normalized and clustered separately. The set of relevant local clustering results is then used to calculate the global clustering result. Furthermore, we present an approach based on both technical and biological quality measures to determine weighting factors that quantify each local result's proportion within the global result. The better the attested quality of the local results, the stronger their impact on the global result.
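The weighted-combination idea can be sketched with co-association matrices: each local clustering votes on whether two genes belong together, votes are scaled by the local result's quality weight, and the global clustering is read off the consensus. This is a minimal illustration under assumed inputs, not the paper's actual algorithm:

```python
import numpy as np

def coassociation(labels):
    """Binary co-association matrix: 1 where two genes share a cluster."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def global_clustering(local_labelings, weights, threshold=0.5):
    """Combine quality-weighted local co-association matrices and derive
    a global clustering from the thresholded consensus (connected components)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # normalize quality weights
    consensus = sum(w * coassociation(l)
                    for w, l in zip(weights, local_labelings))
    n = consensus.shape[0]
    labels, cluster = [-1] * n, 0
    for i in range(n):
        if labels[i] == -1:
            stack, labels[i] = [i], cluster
            while stack:                       # flood-fill one component
                j = stack.pop()
                for k in range(n):
                    if labels[k] == -1 and consensus[j, k] >= threshold:
                        labels[k] = cluster
                        stack.append(k)
            cluster += 1
    return labels

# Two local clusterings of five genes; the second is rated higher quality,
# so its grouping dominates the consensus.
local = [[0, 0, 1, 1, 1], [0, 0, 0, 1, 1]]
print(global_clustering(local, weights=[0.3, 0.7]))
```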
Machine learning, statistics, and knowledge engineering provide a broad variety of supervised learning algorithms for classification. In this paper we introduce the Automated Model Selection Framework (AMSF), which provides automatic and semi-automatic methods to select classifiers. To achieve this, we split the selection process into three distinct phases. Two of them select algorithms by static rules derived from a manually created knowledge base. In the third phase, the user can choose between different rankers. Currently, we use instance-based learning and a scoring scheme for ranking the classifiers. After evaluating different rankers, we will recommend the most successful one to the user by default. Besides describing the architecture and design issues, we also point out the versatile ways AMSF is applied in a production process of the automotive industry.
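A scoring scheme for ranking classifiers can be pictured as a weighted sum over evaluation criteria, with candidates returned best-first. The criteria, weights, and candidate scores below are invented for illustration and do not reproduce AMSF's actual knowledge base:

```python
def rank_classifiers(candidates, weights):
    """Rank classifier candidates by a weighted score over criteria.
    candidates: {name: {criterion: score in [0, 1]}}
    weights:    {criterion: importance}"""
    def total(scores):
        return sum(weights.get(c, 0.0) * s for c, s in scores.items())
    return sorted(candidates, key=lambda name: total(candidates[name]),
                  reverse=True)

# Hypothetical candidates scored on accuracy and training speed.
candidates = {
    "decision_tree": {"accuracy": 0.81, "train_speed": 0.9},
    "svm":           {"accuracy": 0.88, "train_speed": 0.4},
    "naive_bayes":   {"accuracy": 0.74, "train_speed": 1.0},
}
print(rank_classifiers(candidates, {"accuracy": 0.7, "train_speed": 0.3}))
```

Shifting the weights toward accuracy would promote the SVM; the point of such a scheme is that the ranking adapts to the user's priorities rather than a single fixed metric.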
Due to the inherent characteristics of data streams, appropriate mining techniques rely heavily on window-based processing and/or (approximating) data summaries. Because resources such as memory and CPU time for maintaining such summaries are usually limited, the quality of the mining results is affected in different ways. Taking frequent itemset mining and a corresponding change detection as selected mining techniques, we discuss in this paper extensions of stream mining algorithms that allow the output quality to be determined for changes in the available resources (mainly memory space). Furthermore, we outline how to estimate resource consumption based on user-specified quality requirements.
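The resource/quality trade-off described above can be illustrated with the classic Lossy Counting algorithm (Manku and Motwani) for single frequent items: the error parameter epsilon bounds both the memory used by the summary and the undercount of reported frequencies. This is a standard illustration, not the paper's own algorithms:

```python
class LossyCounter:
    """Approximate item frequencies in bounded memory (Lossy Counting).
    Smaller epsilon means more memory but tighter output quality."""

    def __init__(self, epsilon):
        self.epsilon = epsilon
        self.width = max(1, int(1 / epsilon))  # bucket width = 1/epsilon
        self.n = 0                             # stream length so far
        self.entries = {}                      # item -> [count, max undercount]

    def add(self, item):
        self.n += 1
        bucket = -(-self.n // self.width)      # ceil(n / width)
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:           # bucket boundary: prune rare items
            self.entries = {i: e for i, e in self.entries.items()
                            if e[0] + e[1] > bucket}

    def frequent(self, support):
        """Items whose true frequency may reach support * n."""
        threshold = (support - self.epsilon) * self.n
        return {i for i, (c, _) in self.entries.items() if c >= threshold}

# Small demo stream: after pruning, only items that can still be frequent
# survive in the summary.
counter = LossyCounter(epsilon=0.1)
for item in ["a"] * 6 + ["b"] * 3 + ["c"]:
    counter.add(item)
print(counter.frequent(0.5))
```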
Many ubiquitous computing applications so far fail to live up to their expectations. While working perfectly in controllable laboratory environments, they seem to be particularly prone to problems related to a discrepancy between user expectations and system behavior when released into the wild. This kind of unwanted behavior prevents the vision of an emerging trend of context-aware and adaptive applications in mobile and ubiquitous computing from becoming reality. In this paper, we present examples from our practical work and show why, for ubiquitous computing, unwanted behavior is not just a matter of sufficient requirements engineering or of good technical system verification. We furthermore provide a classification of the phenomenon and an analysis of the causes of its occurrence and resolvability in context-aware and adaptive systems.
Geographic information retrieval (GeoIR) considers, in addition to document content, a spatial component of search queries (particularly for web pages) in order to search specifically for pages that are relevant to a particular region. To this end, GeoIR systems must be able to recognize the geographic context of a web page and to decide whether a page is region-specific ("local") at all or has a purely informational character without any geographic reference. In the following, approaches are presented for identifying features of local pages and for using them to classify web pages into global and local pages. Particular attention is paid to the linguistic and geographic properties of German web pages.
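The local/global decision can be illustrated with a simple cue-counting classifier. The cues below (postal codes, phone area codes, a tiny place-name gazetteer) are hypothetical stand-ins for the features studied in the paper:

```python
import re

# Hypothetical geographic cues for German-language pages.
PLZ = re.compile(r"\b\d{5}\b")              # five-digit German postal codes
AREA_CODE = re.compile(r"\b0\d{2,4}[ /-]")  # phone area codes like "05121 "
GAZETTEER = {"hildesheim", "hannover", "berlin", "münchen"}

def locality_score(text):
    """Count geographic cues occurring in the page text."""
    tokens = text.lower().split()
    score = len(PLZ.findall(text))
    score += len(AREA_CODE.findall(text))
    score += sum(1 for t in tokens if t.strip(".,;:") in GAZETTEER)
    return score

def is_local(text, threshold=2):
    """Classify a page as 'local' when enough geographic cues are present."""
    return locality_score(text) >= threshold

print(is_local("Pizzeria Roma, Bahnhofstr. 3, 31134 Hildesheim, Tel. 05121 12345"))
print(is_local("Rezept für Pizzateig: Mehl, Hefe, Wasser und Salz mischen."))
```

A real classifier would learn weights for such features rather than use a fixed threshold, but the sketch shows why address-like patterns separate local business pages from purely informational ones.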
In this paper we present an interface for supporting a user in an interactive cross-language search process using semantic classes. In order to enable users to access multilingual information, different problems have to be solved: disambiguating and translating the query words, as well as categorizing and presenting the results appropriately. Therefore, we first give a brief introduction to word sense disambiguation, cross-language text retrieval and document categorization and finally describe recent achievements of our research towards an interactive multilingual retrieval system. We focus especially on the problem of browsing and navigation of the different word senses in one source and possibly several target languages. In the last part of the paper, we discuss the developed user interface and its functionalities in more detail.
Recently, there has been increased interest in the exploitation of background knowledge in text mining tasks, especially text classification. At the same time, kernel-based learning algorithms like Support Vector Machines have become a dominant paradigm in the text mining community. Among other reasons, this is due to their capability to achieve more accurate learning results by replacing the standard linear (bag-of-words) kernel with customized kernel functions that incorporate additional a priori knowledge. In this paper we propose a new approach to the design of 'semantic smoothing kernels' by means of an implicit superconcept expansion using well-known measures of term similarity. The experimental evaluation on two different datasets indicates that our approach consistently improves performance in situations where (i) training data is scarce or (ii) the bag-of-words representation is too sparse to build stable models with the linear kernel.
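The smoothing idea can be written compactly: instead of the linear kernel k(d1, d2) = d1·d2, use k(d1, d2) = d1ᵀ S Sᵀ d2, where S encodes term similarities. The toy similarity matrix below is invented; the paper's actual similarity measures are not reproduced:

```python
import numpy as np

# Toy vocabulary: car, automobile, banana.
# "car" and "automobile" are semantically similar; "banana" is unrelated.
S = np.array([
    [1.0, 0.8, 0.0],   # car
    [0.8, 1.0, 0.0],   # automobile
    [0.0, 0.0, 1.0],   # banana
])

def semantic_kernel(d1, d2, S):
    """Semantic smoothing kernel: k(d1, d2) = d1^T (S S^T) d2."""
    return d1 @ S @ S.T @ d2

d_car  = np.array([1.0, 0.0, 0.0])   # document mentioning only "car"
d_auto = np.array([0.0, 1.0, 0.0])   # document mentioning only "automobile"

print(d_car @ d_auto)                     # linear kernel: 0.0, no term overlap
print(semantic_kernel(d_car, d_auto, S))  # smoothed: positive via similarity
```

Because S Sᵀ is positive semi-definite, the smoothed function is a valid kernel, so it can be plugged directly into an SVM; this is exactly why sparse bag-of-words documents that share no terms can still be recognized as related.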
Der speziellen Behandlung geographischer Suchanfragen wird im Information Retrieval zunehmend mehr Beachtung geschenkt. So gibt der vorliegende Artikel einen Überblick über aktuelle Forschungsaktivitäten und zentrale Problemstellungen im Bereich des geographischen Information Retrieval, wobei speziell auf das Projekt GeoCLEF im Rahmen der crosslingualen Evaluierungsinitiative CLEF eingegangen wird. Die Informationswissenschaft der Universität Hildesheim hat in diesem Projekt sowohl organisatorische Aufgaben wahrgenommen als auch eigene Experimente durchgeführt. Dabei wurden die Aspekte der Verknüpfung von Gewichtungsansätzen mit Booleschem Retrieval sowie die Gewichtung von geographischen Eigennamen fokussiert. Anhand erster Interpretationen der Ergebnisse und Erfahrungen werden weiterer Forschungsbedarf und zukünftige, eigene Vorhaben wie die Überprüfung von Heuristiken zur Query-Expansion aufgezeigt.
Existing personalization systems typically base their services on general user models that ignore context-awareness. This position paper focuses on developing mechanisms for cross-context reasoning over user models, which can be applied for context-aware personalization. The reasoning augments sparse user models by inferring missing information from other contextual conditions. Thus, it upgrades existing personalization systems and facilitates the provision of accurate context-aware services.
Development of a dynamic Entry Vocabulary Module for the Stiftung Wissenschaft und Politik
(2006)
Vocabulary mismatch between queries and documents is a major problem in information retrieval. In recent years, the Entry Vocabulary Module has become established as a solution to it. This paper presents a dynamic Entry Vocabulary Module that, for a collection with several content-bearing fields, expands the query in a multi-stage process depending on intermediate results. The system was evaluated on a multilingual collection of about 600,000 specialized texts and yielded positive results.
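The core entry-vocabulary idea can be sketched as follows: co-occurrence statistics between free-text terms and controlled-vocabulary descriptors, learned from indexed documents, rank candidate descriptors for query expansion. The training data and descriptors below are invented, and the multi-stage, field-dependent logic of the actual system is not reproduced:

```python
from collections import Counter

def train_entry_vocabulary(documents):
    """documents: iterable of (free_text_terms, descriptors) pairs.
    Returns a mapping term -> Counter of co-occurring descriptors."""
    mapping = {}
    for terms, descriptors in documents:
        for t in terms:
            mapping.setdefault(t, Counter()).update(descriptors)
    return mapping

def expand_query(query_terms, mapping, k=2):
    """Append the k controlled descriptors most strongly associated
    with the query terms."""
    scores = Counter()
    for t in query_terms:
        scores.update(mapping.get(t, Counter()))
    return list(query_terms) + [d for d, _ in scores.most_common(k)]

# Hypothetical indexed documents with free-text terms and descriptors.
docs = [
    (["zoll", "handel"], ["Außenhandelspolitik"]),
    (["zoll", "grenze"], ["Außenhandelspolitik", "Grenzkontrolle"]),
]
mapping = train_entry_vocabulary(docs)
print(expand_query(["zoll"], mapping, k=1))
```

A dynamic variant would rerun this step per field and feed intermediate result sets back into the statistics, which is the multi-stage behavior the paper describes.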
The exchange of personal experiences is a way of supporting decision making and interpersonal communication. In this article, we discuss how augmented personal memories could be exploited to support such sharing. We start with a brief summary of a system implementing an augmented memory for a single user. Then, we exploit results from interviews to define an example scenario involving sharable memories. This scenario serves as the background for a discussion of various questions related to sharing memories and potential approaches to their solution. We especially focus on the selection of relevant experiences and sharing partners, sharing methods, and the configuration of those sharing methods by means of reflection.
In this work we describe a "semantic personalization" web service for curriculum planning. Based on a semantic annotation of a set of courses provided by the University of Hannover, reasoning about actions and change (in particular classical planning) is exploited for creating personalized curricula, i.e. for selecting and sequencing a set of courses that will allow a student to achieve her learning goal. The specific student's context is taken into account during the process: students with different initial knowledge will be suggested different solutions. The Curriculum Planning Service has been integrated as a new plug-and-play personalization service in the Personal Reader framework.
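Curriculum planning as classical planning can be sketched minimally: each course has preconditions (required competencies) and effects (competencies acquired), and a forward search applies courses until the learning goal is covered. The course data is invented, and real planners search more cleverly than this greedy loop:

```python
def plan_curriculum(initial, goal, courses):
    """Greedy forward planning over courses.
    courses: {name: (preconditions, effects)} as sets of competencies."""
    state, plan = set(initial), []
    while not goal <= state:
        applicable = [name for name, (pre, eff) in courses.items()
                      if pre <= state and not eff <= state
                      and name not in plan]
        if not applicable:
            return None                      # goal unreachable from here
        course = applicable[0]
        plan.append(course)
        state |= courses[course][1]          # apply the course's effects
    return plan

courses = {
    "Programming 1": (set(), {"basic programming"}),
    "Databases":     ({"basic programming"}, {"sql"}),
    "Web Systems":   ({"sql"}, {"web development"}),
}
# A student who already knows basic programming gets a shorter curriculum,
# mirroring the context-dependence described above.
print(plan_curriculum(set(), {"web development"}, courses))
print(plan_curriculum({"basic programming"}, {"web development"}, courses))
```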
User Centric Hierarchical Classification and Associated Evaluation Measures for Document Retrieval
(2006)
This paper presents an approach that, based on techniques of visual data exploration and semantics-based fusion, allows analysis methods such as data mining and visualization techniques to be used for knowledge generation in distributed, cooperative environments. Using ontologies for the semantic description of distributed sources makes it possible to fuse the data and analysis methods from these sources. The core of the architecture is the gateway component, which allows the analyst to use data and analysis methods in a distributed environment. The presented components were evaluated in a medical application scenario.
In this paper, we propose a case-based approach for characterizing and analyzing subgroup patterns: We present techniques for retrieving characteristic factors and cases, and merge these into prototypical cases for presentation to the user. In general, cases capture knowledge and concrete experiences of specific situations. By exploiting case-based knowledge for characterizing a subgroup pattern, we can provide additional information about the subgroup extension. We can then present the subgroup pattern in an alternative condensed form that characterizes the subgroup, and enables a convenient retrieval of interesting associated (meta-)information.
This paper presents results from an initial user study exploring the relationship between system effectiveness, as quantified by traditional measures such as precision and recall, and users' effectiveness and satisfaction with the results. The tasks involve finding images for recall-based tasks. Consistent with previous research, no direct relationship between system effectiveness and user performance could be established: people learn to adapt to a system regardless of its effectiveness. This study recommends that a combination of attributes (e.g. system effectiveness, user performance, and satisfaction) is a more effective way to evaluate interactive retrieval systems. Results of this study also reveal that users are more concerned with accuracy than with coverage of the search results.
This paper reports on experiments that attempt to characterize the relationship between users and their knowledge of the search topic in a Question Answering (QA) system. It also investigates user search behavior with respect to the length of answers presented by a QA system. Two answer lengths were compared: snippets (one to two sentences of text) and exact answers. A user test was conducted in which 44 participants judged 92 factoid questions, exploring the participants' preferences, feelings, and opinions about QA system tasks. The results showed that participants preferred the snippet set and achieved higher accuracy in finding answers with it. However, accuracy varied with users' topic familiarity: users were only substantially helped by the wider context of a snippet if they were already familiar with the topic of the question; without such familiarity, users were about as accurate at locating answers from the snippets as they were with the exact answers.
The Personal Reader Framework enables the design, realization, and maintenance of personalized Web Content Readers. In this architecture, personalized access to web content is realized by various Web Services, which we call Personalization Services. With our new approach of Configurable Web Services, we allow users to configure these Personalization Services. Such configurations can be stored and reused at a later time. The interface between users and Configurable Web Services is realized in a Personal Reader Agent. This agent allows selecting, configuring, and calling the Web Services, and further provides personalization functionality such as the reuse of stored configurations that suit the user's interests.