Selected Methods for non-Gaussian Data Analysis

Title: Selected Methods for non-Gaussian Data Analysis
Publication Type: Book
Year of Publication: 2019
Authors: Domino K
Publisher: IITiS PAN
ISBN: 978-83-926054-3-0
Keywords: copulas, features selection, Financial data analysis, Higher order multivariate cumulants, information extraction, non-Gaussian distributions, tensor analysis

The basic goal of computer engineering is the analysis of data. Such data often form large data sets that follow various distribution models. In this manuscript we focus on the analysis of non-Gaussian distributed data. For univariate data, we discuss stochastic processes with auto-correlated increments and univariate distributions derived from specific stochastic processes, namely the Lévy and Tsallis distributions. A deeper investigation of multivariate non-Gaussian distributions requires the copula approach. A copula is a component of a multivariate distribution that models the mutual interdependence between marginals. There are many copula families, characterised by various measures of the dependence between marginals. Importantly, these measures include 'tail' dependencies, which model the simultaneous appearance of extreme values in many marginals. Such extreme events may reflect a crisis in financial data, outliers in machine learning, or traffic congestion. Next, we discuss higher order multivariate cumulants, which vanish for multivariate Gaussian distributions, so that non-zero values indicate a non-Gaussian distribution. However, the relation between cumulants and copulas is not straightforward. We discuss the application of those cumulants to extract information about non-Gaussian multivariate distributions, such as information about non-Gaussian copulas. The use of higher order multivariate cumulants in computer science is inspired by financial data analysis, especially by the evaluation of safe investment portfolios. There are many other applications of higher order multivariate cumulants in data engineering, especially in signal processing, non-linear system identification, blind source separation, and direction-finding algorithms for multi-source signals.
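
As a rough illustration of how higher order multivariate cumulants can flag non-Gaussian data, the sketch below estimates the third-order cumulant tensor (which, at order three, coincides with the third central moment tensor) for two synthetic bivariate samples. It is not taken from the book: the function names `third_cumulant` and `max_abs`, the sample size, and both synthetic data sets are assumptions made only for this example.

```python
import random
from itertools import product


def third_cumulant(data):
    """Estimate the third-order cumulant tensor of multivariate samples.

    At order three the cumulant equals the third central moment, so each
    entry (i, j, k) is the sample mean of the product of centered
    coordinates i, j and k. The tensor is returned as a dict keyed by index.
    """
    n = len(data)
    d = len(data[0])
    means = [sum(row[i] for row in data) / n for i in range(d)]
    centered = [[row[i] - means[i] for i in range(d)] for row in data]
    return {
        (i, j, k): sum(r[i] * r[j] * r[k] for r in centered) / n
        for i, j, k in product(range(d), repeat=3)
    }


def max_abs(tensor):
    """Largest absolute entry of a tensor stored as a dict."""
    return max(abs(v) for v in tensor.values())


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000               # hypothetical sample size

# Correlated bivariate Gaussian sample: cumulants of order three should be
# close to zero, up to sampling noise.
gaussian = []
for _ in range(n):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    gaussian.append((z1, 0.6 * z1 + 0.8 * z2))

# Dependent, skewed (non-Gaussian) sample built from exponential variables.
skewed = []
for _ in range(n):
    e1, e2 = rng.expovariate(1.0), rng.expovariate(1.0)
    skewed.append((e1, e1 + e2))

c3_gaussian = third_cumulant(gaussian)
c3_skewed = third_cumulant(skewed)
gauss_max = max_abs(c3_gaussian)
skew_max = max_abs(c3_skewed)

print("largest |C3| entry, Gaussian sample:", gauss_max)
print("largest |C3| entry, skewed sample:  ", skew_max)
```

Under these assumptions the Gaussian sample should give entries near zero, while the skewed sample should give entries well away from zero. A near-zero third-order tensor alone does not establish Gaussianity, since symmetric non-Gaussian distributions also have vanishing third cumulants; in that case cumulants of order four or higher would need to be inspected.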