
Invariant Features For Time-Series Classification

  • Time series are among the most widespread types of data, occurring in a myriad of application domains that range from physiological sensors to astronomical light intensities. Time-series classification is one of the most prominent challenges in this area: it uses a recorded set of expert-labeled time series to automatically predict the labels of future series without the need for an expert. The patterns of time series are often shifted in time, have different scales, contain arbitrarily repeating patterns, and exhibit local distortions or noise. In other cases, the differences among classes are attributed to small local segments rather than the global structure. For those reasons, values corresponding to a particular time stamp have different semantics in different time series. We call this phenomenon intra-class variation. The lion's share of this thesis presents new methods that accurately classify time-series instances by handling such variations. The key to resolving the bottlenecks of intra-class variations is not to use the time-series values directly as features. Instead, the approach of this thesis is to extract a set of features that, on the one hand, represent all the variations of the data and, on the other hand, boost classification accuracy. In other words, this thesis proposes a series of methods that address diverse aspects of intra-class variations. The first proposed approach generates new training instances by transforming the support vectors of an SVM. The second approach decomposes time series through a segment-wise convolutional factorization; the strategy involves learning a set of patterns and weights whose product approximates each sub-sequence of the time series. However, the main contribution of the thesis is the third approach, called shapelet learning, which utilizes the training labels during the learning process, i.e. the process is supervised.
Because the features are learned using the training labels, they tend to perform strongly when predicting the test labels. In addition, we present a fast alternative method for shapelet discovery. Our strategy is to prune segment candidates using a two-step approach. First, we prune candidates based on their similarity to previously considered candidates. Second, dissimilar (hence diverse) candidates are selected only if the features they produce improve the classification results. The last two chapters of the thesis describe two methods that extract features from datasets with special characteristics. More concretely, we propose a classification method suited for series with missing values, as well as a method that extracts features from time series with repetitive patterns.
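
As a rough illustration of why a shapelet-based feature can tolerate patterns that are shifted in time, the sketch below computes the minimum distance between one fixed shapelet and every sliding window of a series. It is a minimal sketch under stated assumptions, not the learning procedure of the thesis: the function name `shapelet_distance`, the mean squared distance, and the toy series are all hypothetical choices made for this example, and the shapelet is fixed by hand here rather than learned from the training labels.

```python
from typing import Sequence


def shapelet_distance(series: Sequence[float], shapelet: Sequence[float]) -> float:
    """Minimum mean squared distance between `shapelet` and any window of `series`.

    Hypothetical helper for illustration only; the distance measure is an assumption.
    """
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = float("inf")
    # Slide the shapelet over every window of equal length and keep the best match.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = sum((w - s) ** 2 for w, s in zip(window, shapelet)) / m
        best = min(best, dist)
    return best


# Toy data (hypothetical): the same bump occurs at different time offsets.
bump = [0.0, 1.0, 2.0, 1.0, 0.0]
early = bump + [0.0] * 5   # bump at the start of the series
late = [0.0] * 5 + bump    # the same bump, shifted in time
flat = [0.0] * 10          # no bump at all

# One feature per series: how closely the series contains the shapelet anywhere.
features = [shapelet_distance(s, bump) for s in (early, late, flat)]
```

In this toy setting, `early` and `late` receive the same feature value even though their raw values differ at every time stamp where the bump occurs, while `flat` receives a larger one. A classifier trained on such features would therefore not depend on where in the series the pattern appears.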

Metadata
Author: Josif Grabocka
URN: https://nbn-resolving.org/urn:nbn:de:gbv:hil2-opus4-5211
Place of publication: Hildesheim
Referee: Lars Schmidt-Thieme, Mustafa Baydogan
Advisor: Lars Schmidt-Thieme
Document Type: Doctoral Thesis
Language: English
Year of Completion: 2016
Granting Institution: Universität Hildesheim, Fachbereich IV
Date of final exam: 2016/01/14
Release Date: 2016/03/09
Tag: Machine Learning, Data Mining, Time-series classification
Number of pages: 196
Institutes: Fachbereich IV
DDC classes: 000 General works, computer science, information science / 000 General works, science / 005 Computer programming, programs, data
Licence (German): Creative Commons - Attribution 3.0