University of Jyväskylä

Dissertation: 14.12.2017 "Intelligent solutions for real-life data-driven applications" (Ivannikova)

Start date: Dec 14, 2017 12:00 PM

End date: Dec 14, 2017 03:00 PM

Location: Mattilanniemi, Agora Beeta

Elena Ivannikova (picture: Rufina Valieva / Spice and Ice Photography)
Elena Ivannikova defends her doctoral dissertation in Mathematical Information Technology, "Intelligent solutions for real-life data-driven applications". The opponent is Docent Xiao-Zhi Gao (Aalto University) and the custos is Professor Timo Hämäläinen (University of Jyväskylä). The doctoral dissertation defence is held in English.


This thesis belongs to the field of machine learning, specifically to the development of advanced methods for regression analysis, clustering, and anomaly detection. Industry constantly seeks improved production practices and lower production time and costs. To this end, several industrial case studies are presented in which mathematical models for predicting paper quality are proposed. The most important variables for the prediction models are selected using information-theoretic measures and a regression tree approach.

The rest of the original papers are devoted to unsupervised machine learning. The main focus is on developing advanced spectral clustering techniques for community detection and anomaly detection. As part of these efforts, a number of enhancements to the dependence clustering algorithm are proposed: regularization for controlling the size of clusters, an ensemble version for improving model stability, support for overlapping clusters, adaptation to anomaly detection problems, and handling of big datasets. Another focus of the thesis is on developing anomaly detection algorithms for network security data. Here, a probabilistic transition-based approach is proposed for detecting application-layer distributed denial-of-service attacks.

The developed approaches are tested on real datasets, where they solve the given tasks efficiently and with high accuracy. They are shown to be applicable to variable selection, graph segmentation, and anomaly detection tasks in different applications.

More information

Elena Ivannikova