Online Dimensionality Reduction using Competitive Learning and Radial Basis Function Network

Vladimir Tomenko obtained his Bachelor's degree from Kharkiv National University of Radioelectronics and completed his Master of Philosophy at the Wessex Institute of Technology. Vladimir continued his research at WIT and has now successfully passed his PhD viva with the thesis “Online Dimensionality Reduction using Competitive Learning and Radial Basis Function Network”. His external examiner was Prof Nigel Allinson from the University of Sheffield and his internal examiner was Prof Alex Galybin.

The main objective of Vladimir’s work was to develop a dimensionality reduction method that can be applied for different purposes, such as feature extraction and visualization. Machine learning methods are known to suffer from high dimensionality of the data space. Therefore, the first practical step is to reduce the dimensionality of the data, thereby capturing the latent, or hidden, variables governing the process of interest.

Classical approaches, such as Principal Component Analysis (PCA) and Multidimensional Scaling, fail to recover latent variables if the data manifold is embedded nonlinearly in the observation space. Advanced manifold learning methods, in turn, require additional optimization for previously unseen data patterns.
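The limitation of linear projections can be illustrated with a small toy example. The sketch below is not taken from the thesis: the arch-shaped data, the variable names and the use of NumPy are all assumptions made for illustration. A single latent variable t is bent into a tall arch in two dimensions, and the one-dimensional PCA coordinate is then compared with t.

```python
import numpy as np

# Hypothetical toy data: a 1-D latent variable t bent into a tall arch in 2-D.
t = np.linspace(0.0, np.pi, 200)
X = np.column_stack([np.cos(t), 3.0 * np.sin(t)])

# Linear PCA: centre the data, then take the leading eigenvector of the covariance.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
first_pc = eigvecs[:, -1]   # eigh returns eigenvalues in ascending order
z = Xc @ first_pc           # 1-D PCA coordinate of every point

# How well does the PCA coordinate track the true latent variable?
corr = np.corrcoef(t, z)[0, 1]
# The two ends of the arch are far apart along the curve, yet PCA merges them.
end_gap = abs(z[0] - z[-1])

print(f"correlation between t and PCA coordinate: {corr:.3f}")
print(f"gap between the two curve ends after PCA: {end_gap:.3e}")
```

In this construction the leading principal direction is the vertical axis, so points on the left and right halves of the arch receive the same coordinate and the ordering along the curve is lost.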

Neural Networks (NN) provide an invaluable tool for the synthesis of adaptive and computationally less demanding algorithms. Vladimir has developed an NN-based method that preserves data structures globally, learns nonlinearly embedded manifolds and processes novel patterns in online mode. Furthermore, the method is scalable, i.e. it works with massive datasets and streaming data, which is especially useful for real-world problems.
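The two ingredients named in the thesis title can be sketched generically as follows. This is not Vladimir's algorithm, whose details are not given here. It is a minimal illustration, under stated assumptions, of how online competitive learning and a radial basis function (RBF) network might be combined. The synthetic data stream, the number of prototypes, the learning rate, the kernel width and, in particular, the use of a linear PCA projection as placeholder low-dimensional targets are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def competitive_step(prototypes, x, lr=0.05):
    """Winner-take-all update: pull the nearest prototype toward sample x."""
    winner = np.argmin(np.linalg.norm(prototypes - x, axis=1))
    prototypes[winner] += lr * (x - prototypes[winner])

def quantization_error(prototypes, X):
    """Mean distance from each sample to its nearest prototype."""
    d = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    return d.min(axis=1).mean()

def rbf_activations(X, centres, width):
    """Gaussian RBF responses of every sample to every centre."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

# Hypothetical stream: 3-D observations lying on a curved 2-D sheet.
u = rng.uniform(-1.0, 1.0, size=(2000, 2))
stream = np.column_stack([u[:, 0], u[:, 1], u[:, 0] ** 2])

# 1) Competitive learning, one sample at a time (online).
prototypes = rng.normal(scale=0.01, size=(12, 3))
err_before = quantization_error(prototypes, stream)
for x in stream:
    competitive_step(prototypes, x)
err_after = quantization_error(prototypes, stream)

# 2) Placeholder low-dimensional targets for the prototypes.
#    ASSUMPTION: a linear PCA projection stands in for the real embedding step.
centred = prototypes - prototypes.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
targets = centred @ vt[:2].T

# 3) RBF network: prototypes as centres, output weights fitted by least squares.
width = 0.5  # hypothetical kernel width
Phi = rbf_activations(prototypes, prototypes, width)
weights, *_ = np.linalg.lstsq(Phi, targets, rcond=None)

# A previously unseen pattern is mapped by one forward pass, with no re-optimization.
new_point = np.array([[0.3, -0.2, 0.09]])
embedded = rbf_activations(new_point, prototypes, width) @ weights

print(f"quantization error: {err_before:.3f} -> {err_after:.3f}")
print("embedding of new point:", embedded)
```

The point of the sketch is the division of labour: the prototypes are updated one sample at a time, so the data never has to be held in memory at once, and a new pattern is embedded by evaluating the fitted RBF network rather than by solving a fresh optimization problem.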

The results of visualization experiments on linear and nonlinear embeddings, using both artificial and real-world data, indicate the method’s applicability to a wide class of problems. With respect to feature extraction, the method proved advantageous compared with PCA when applied to preprocess real-world data describing a wastewater treatment process. Finally, the method provides an extensible framework for, e.g., tracking nonstationary processes and learning complex manifolds.