Ph.D. Thesis Colloquium of
Mr. Sobin Joseph

Thesis Supervisor: Prof. Shashi Jain
Date: 8th February 2024 [Thursday]
Time: 11:00 AM
Venue: Seminar Hall [Management Studies]

A Neural Network Approach to Modelling Hawkes Processes for Financial Applications

In 1971, Alan Hawkes introduced a class of stochastic point processes characterised by the feature that the occurrence of an event raises the likelihood of subsequent events, leading to the clustering of events. This class of models, referred to as the Hawkes process, has proven applicable in a variety of practical contexts. It was initially employed in the study of earthquake sequences, where the occurrence of an earthquake raises the probability of subsequent aftershocks. Because of its clustering behaviour, the Hawkes process has since found widespread use, particularly in financial modelling. This thesis explores the modelling of the Hawkes process for understanding and predicting financial events. Understanding the clustering phenomenon requires knowledge of the conditional intensity function of the Hawkes process, which gives the instantaneous rate at which events are expected to occur at a given time, conditional on the history of past events. In the Hawkes process, the conditional intensity is formulated as the sum of a constant base intensity and kernel functions that model the impact of past arrivals on the current intensity. Estimating the kernel of the Hawkes process is therefore crucial for understanding the clustering effect of events, which in turn provides insights into how the dynamics of financial processes evolve over time.
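
The intensity formulation described above can be illustrated with a minimal sketch. The exponential kernel, the parameter values (`mu`, `alpha`, `beta`) and the function names are assumptions chosen for illustration; they are not taken from the thesis.

```python
import numpy as np

def exp_kernel(dt, alpha=0.5, beta=1.0):
    """Hypothetical exponential kernel: the impact of an arrival decays with lag dt."""
    return alpha * np.exp(-beta * dt)

def conditional_intensity(t, arrivals, mu=0.2, kernel=exp_kernel):
    """Constant base intensity plus the kernel summed over all arrivals before t."""
    arrivals = np.asarray(arrivals, dtype=float)
    past = arrivals[arrivals < t]
    return mu + kernel(t - past).sum()

arrivals = [1.0, 1.5, 4.0]  # illustrative arrival times
before_any = conditional_intensity(0.5, arrivals)     # no past arrivals: base level only
after_cluster = conditional_intensity(1.6, arrivals)  # shortly after two arrivals
long_after = conditional_intensity(3.9, arrivals)     # excitation has mostly decayed
```

In this sketch the intensity is elevated just after the two nearby arrivals and decays back towards the base level, which is the clustering mechanism the paragraph describes.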

Our first objective is to introduce a novel non-parametric estimation method for the kernels of linear Hawkes processes, specifically those with excitatory kernels, which raise the intensity above its base level. This method, named the Shallow Neural Hawkes (SNH) model, approximates the Hawkes kernel with a two-layer feed-forward neural network that has a single hidden layer. The approximated kernel function can recover any positive kernel without assuming a parametric form for it.
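
As a rough illustration of a kernel represented by a single-hidden-layer network, the sketch below evaluates an untrained network at several time lags. The hidden width, the ReLU activation and the exponential output used to keep the kernel positive are all assumptions made for this example; the abstract does not specify the SNH architecture at this level of detail.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 8  # hypothetical hidden-layer width

# Untrained, randomly initialised parameters (illustrative only).
params = {
    "w1": rng.normal(size=HIDDEN),        # input-to-hidden weights
    "b1": rng.normal(size=HIDDEN),        # hidden biases
    "w2": rng.normal(size=HIDDEN) * 0.1,  # hidden-to-output weights
    "b2": 0.0,                            # output bias
}

def shallow_kernel(dt, p=params):
    """Map time lags to strictly positive kernel values with one hidden layer."""
    dt = np.atleast_1d(np.asarray(dt, dtype=float))
    hidden = np.maximum(0.0, np.outer(dt, p["w1"]) + p["b1"])  # ReLU hidden layer
    return np.exp(hidden @ p["w2"] + p["b2"])  # exponential output keeps values > 0

lags = np.linspace(0.0, 5.0, 6)
values = shallow_kernel(lags)
```

Because the output is always positive, a network of this shape can only represent excitatory kernels, whatever values its weights take.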

However, the SNH cannot be used to model Hawkes processes whose kernels are negative, that is, inhibitory in nature. To address this limitation, our second objective introduces the Neural Network for Nonlinear Hawkes (NNNH), which is specifically designed to model the nonlinear Hawkes process. The NNNH is capable of estimating both excitatory and inhibitory kernels. Like the SNH, the NNNH employs a feed-forward neural network, here to approximate both the base intensity and the kernels. However, it adopts a different network architecture from that of the SNH to ensure that the intensity function retains its required properties, notably non-negativity, when negative kernels are involved. The NNNH model’s flexibility enables it to recover a broad range of stochastic point processes.
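
To see why inhibitory kernels call for a nonlinear treatment, consider the sketch below, in which a negative kernel would push the linear sum below zero. Clipping at zero is one common way to keep an intensity valid and is used here as an assumption; it is not necessarily the construction used in the NNNH, and the kernel and parameter values are hypothetical.

```python
import numpy as np

def inhibitory_kernel(dt, alpha=-0.8, beta=1.0):
    """Hypothetical negative kernel: each arrival temporarily lowers the intensity."""
    return alpha * np.exp(-beta * dt)

def nonlinear_intensity(t, arrivals, mu=0.5, kernel=inhibitory_kernel):
    """Base intensity plus kernel contributions, clipped so the result is never negative."""
    arrivals = np.asarray(arrivals, dtype=float)
    past = arrivals[arrivals < t]
    linear_part = mu + kernel(t - past).sum()  # may be negative under inhibition
    return max(linear_part, 0.0)               # assumed link function: clip at zero

arrivals = [1.0, 1.1]  # illustrative arrival times
suppressed = nonlinear_intensity(1.2, arrivals)  # linear part is negative here
recovered = nonlinear_intensity(10.0, arrivals)  # inhibition has decayed away
```

Right after the two arrivals the linear sum is negative, so the clipped intensity is zero; much later it returns to just below the base level.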

Both the SNH and the NNNH are suited to Hawkes processes in which the history of past arrival times alone determines the clustering effect. However, there may be cases where the clustering of events is influenced by factors other than past arrival times. A Hawkes process in which the likelihood of an event arriving is a function of both past arrival times and past marks is known as a marked Hawkes process. A mark can be any quantity other than time that is attached to an event, such as the magnitude of an earthquake, the volume or price of a market order, or the number of followers on social media. Our final objective targets the non-parametric estimation of kernels for a marked Hawkes process. Here, we present a first-of-its-kind method that approximates the kernel of the marked Hawkes process as a joint function of time and mark.
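
A kernel that depends on both the time lag and the mark can be sketched as follows. The multiplicative form of `marked_kernel` and all parameter values are assumptions for illustration; the method in the thesis estimates this dependence non-parametrically rather than fixing it in advance.

```python
import numpy as np

def marked_kernel(dt, mark, alpha=0.3, beta=1.0):
    """Hypothetical kernel of lag and mark: larger marks excite more strongly."""
    return alpha * mark * np.exp(-beta * dt)

def marked_intensity(t, arrivals, marks, mu=0.2, kernel=marked_kernel):
    """Base intensity plus kernel contributions that use each past event's mark."""
    arrivals = np.asarray(arrivals, dtype=float)
    marks = np.asarray(marks, dtype=float)
    is_past = arrivals < t
    return mu + kernel(t - arrivals[is_past], marks[is_past]).sum()

# Same arrival time, different mark (for example, order volume).
small_mark = marked_intensity(2.0, arrivals=[1.0], marks=[1.0])
large_mark = marked_intensity(2.0, arrivals=[1.0], marks=[5.0])
```

With this assumed kernel, the event carrying the larger mark leaves the intensity higher one time unit later, even though both events arrived at the same time.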

All three proposed methods are non-parametric and rely on a feed-forward neural network to approximate the Hawkes kernel. In each case, the optimal network parameters are obtained by maximising the log-likelihood function of the Hawkes process, which yields the best representation of the given process under that criterion. To assess the efficiency of each method, synthetic datasets with known ground truth are used, and the results are compared against those of existing estimation methods. Furthermore, the methods are applied to real-life datasets of cryptocurrency trading, highlighting their practical relevance. The evaluation on real datasets reveals that the clustering effects in cryptocurrency trading are better explained by the self-dependency of events than by mutual dependency. An advantage of the neural-network-based framework proposed in this thesis is that it allows the recovery of causal relationships in the underlying process. This mitigates the black-box nature often associated with conventional neural network models, a quality that is particularly desirable in the financial context.
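
The log-likelihood that is maximised has the standard point-process form: the sum of the log-intensities at the observed arrivals minus the integral of the intensity over the observation window. The sketch below evaluates it for an exponential kernel, for which the integral has a closed form. The exponential kernel, the parameter names and the toy data are assumptions for illustration; with a neural network kernel the integral would have to be handled differently.

```python
import numpy as np

def hawkes_log_likelihood(arrivals, horizon, mu, alpha, beta):
    """Log-likelihood on [0, horizon] for the assumed kernel alpha * exp(-beta * dt)."""
    arrivals = np.asarray(arrivals, dtype=float)
    lags = arrivals[:, None] - arrivals[None, :]  # lags[i, j] = t_i - t_j
    is_past = lags > 0                            # only earlier events contribute
    safe_lags = np.where(is_past, lags, 0.0)      # avoid overflow on negative lags
    excitation = np.where(is_past, alpha * np.exp(-beta * safe_lags), 0.0).sum(axis=1)
    log_intensity_sum = np.log(mu + excitation).sum()
    # Closed-form integral of the intensity over [0, horizon].
    compensator = mu * horizon + (alpha / beta) * (
        1.0 - np.exp(-beta * (horizon - arrivals))
    ).sum()
    return log_intensity_sum - compensator

arrivals = [0.5, 0.6, 0.7, 3.0, 3.1]  # toy sequence with two visible clusters
clustered_fit = hawkes_log_likelihood(arrivals, horizon=5.0, mu=0.5, alpha=0.8, beta=2.0)
poisson_fit = hawkes_log_likelihood(arrivals, horizon=5.0, mu=0.5, alpha=0.0, beta=2.0)
```

For this toy sequence the self-exciting parameters score a higher log-likelihood than the constant-rate ones, which is the kind of comparison an optimiser exploits when searching over parameters.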