Identification of Stochastic Nonlinear Dynamical Models Using Estimating Functions

Time: Fri 2019-09-06 10.00

Location: F3, Lindstedtvägen 26, Stockholm (English)

Subject area: Electrical Engineering

Doctoral student: Mohamed Abdalmoaty, Reglerteknik, KTH Royal Institute of Technology, System Identification

Opponent: Associate Professor Adrian Wills, University of Newcastle, Australia

Supervisor: Håkan Hjalmarsson, Reglerteknik


Data-driven modeling of stochastic nonlinear systems is recognized as a very challenging problem, even when reduced to a parameter estimation problem. A main difficulty is the intractability of the likelihood function, which renders favored estimation methods, such as the maximum likelihood method, analytically intractable. During the last decade, several numerical methods have been developed to approximately solve the maximum likelihood problem. A class of algorithms that attracted considerable attention is based on sequential Monte Carlo algorithms (also known as particle filters/smoothers) and particle Markov chain Monte Carlo algorithms. These algorithms were able to obtain impressive results on several challenging benchmark problems; however, their application is so far limited to cases where fundamental limitations, such as the sample impoverishment and path degeneracy problems, can be avoided.

This thesis introduces relatively simple alternative parameter estimation methods that may be used for fairly general stochastic nonlinear dynamical models. They are based on one-step-ahead predictors that are linear in the observed outputs and do not require computing the likelihood function. The resulting estimators are therefore relatively easy to compute and may be highly competitive in this regard: in several relevant cases they are defined by analytically tractable objective functions. In cases where the predictors are analytically intractable due to the complexity of the model, it is possible to resort to plain Monte Carlo approximations. Under certain assumptions on the data and some conditions on the model, the convergence and consistency of the estimators can be established. Several numerical simulation examples and a recent real-data benchmark problem demonstrate good performance of the proposed method in several cases that are considered challenging, with a considerable reduction in computational time compared with state-of-the-art sequential Monte Carlo implementations of the maximum likelihood (ML) estimator.
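As a rough illustration of the idea of replacing an intractable predictor by a plain Monte Carlo approximation, the following minimal Python sketch estimates one parameter of a toy stochastic nonlinear model. Everything in it is an assumption made for illustration and is not taken from the thesis: the model x_{t+1} = theta*tanh(x_t) + u_t + w_t, y_t = x_t + e_t, the noise levels, the grid search, and all function names. The predictor used here is the output mean E[y_t; theta], which does not depend on the observed outputs and is therefore only a trivial special case of a predictor that is linear in the outputs.

```python
import math
import random


def simulate(theta, u, rng, w_std=0.2, e_std=0.2):
    """Simulate one output trajectory of the (hypothetical) toy model."""
    x, y = 0.0, []
    for u_t in u:
        y.append(x + rng.gauss(0.0, e_std))                     # y_t = x_t + e_t
        x = theta * math.tanh(x) + u_t + rng.gauss(0.0, w_std)  # state update
    return y


def mc_mean_predictor(theta, u, n_sims=100, seed=1):
    """Plain Monte Carlo approximation of E[y_t; theta] for every t."""
    # The same seed is reused for every theta (common random numbers),
    # so the approximate cost varies smoothly with theta.
    rng = random.Random(seed)
    total = [0.0] * len(u)
    for _ in range(n_sims):
        # The measurement noise has zero mean, so it is switched off here.
        for t, y_t in enumerate(simulate(theta, u, rng, e_std=0.0)):
            total[t] += y_t
    return [s / n_sims for s in total]


def cost(theta, u, y):
    """Mean squared prediction error for a given theta."""
    y_hat = mc_mean_predictor(theta, u)
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)


def estimate(u, y, grid):
    """Pick the grid point with the smallest prediction error cost."""
    return min(grid, key=lambda th: cost(th, u, y))


# Generate synthetic data from the toy model with a known parameter.
data_rng = random.Random(0)
theta_true = 0.7
u = [data_rng.gauss(0.0, 1.0) for _ in range(400)]
y = simulate(theta_true, u, data_rng)

grid = [i / 20 for i in range(21)]  # candidate values 0.00, 0.05, ..., 1.00
theta_hat = estimate(u, y, grid)
print("estimate:", theta_hat, "true value:", theta_true)
```

No particle filter or likelihood evaluation is involved: each cost evaluation only requires simulating the model forward. A grid search is used purely to keep the sketch short; a gradient-based optimizer would be the more usual choice when there are several parameters.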

Moreover, we provide some insight into the asymptotic properties of the proposed methods. We show that the accuracy of the estimators depends on the model parameterization and the shape of the unknown distribution of the outputs (via the third and fourth moments). In particular, it is shown that when the model is non-Gaussian, a prediction error method based on the Gaussian assumption is not necessarily more accurate than one based on an optimally weighted parameter-independent quadratic norm. Therefore, it is generally not obvious which method should be used. This result contrasts with a belief currently held in some of the literature on the subject.

Furthermore, we introduce the estimating functions approach, which was mainly developed in the statistics literature, as a generalization of the maximum likelihood and prediction error methods. We show how it may be used to systematically define optimal estimators, within a predefined class, using only a partial specification of the probabilistic model. Unless the model is Gaussian, this leads to estimators that are asymptotically uniformly more accurate than linear prediction error methods when quadratic criteria are used. Convergence and consistency are established under standard regularity and identifiability assumptions akin to those of prediction error methods.
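For orientation, the general form of an estimating function can be written as follows. This is the standard textbook formulation, stated here as a sketch; the symbols G_N, W_t, and \varepsilon_t are notation assumed for this illustration and may differ from the notation used in the thesis.

```latex
% Estimating function built from the prediction errors, and the estimator it defines
\[
  G_N(\theta) = \sum_{t=1}^{N} W_t(\theta)\,\varepsilon_t(\theta),
  \qquad
  \varepsilon_t(\theta) = y_t - \hat{y}_{t|t-1}(\theta),
  \qquad
  G_N(\hat{\theta}_N) = 0 .
\]
% Special case 1: maximum likelihood, where G_N is the score function
\[
  G_N(\theta) = \frac{\partial}{\partial \theta} \log p_\theta(y_1, \dots, y_N).
\]
% Special case 2: quadratic prediction error criterion; setting the gradient of
% (1/2) \sum_t \varepsilon_t^2(\theta) to zero gives the weights
\[
  W_t(\theta) = \left( \frac{\partial \hat{y}_{t|t-1}(\theta)}{\partial \theta} \right)^{\!\top} .
\]
```

In this view, choosing an estimator within a class amounts to choosing the weights W_t, which is what makes a systematic search for an optimal member of the class possible.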

Finally, we consider the problem of closed-loop identification when the system is stochastic and nonlinear. We consider a couple of scenarios, distinguished by the assumptions on the disturbances, the measurement noise, and the knowledge of the feedback mechanism. They include a challenging case where the feedback mechanism is completely unknown to the user. Our methods can be regarded as generalizations of some classical closed-loop identification approaches for the linear time-invariant case. We provide an asymptotic analysis of the methods and demonstrate their properties in a simulation example.