Towards Unsupervised, Analysable and Scalable Node Embedding Models for Transaction Networks

Time: Wed 2025-12-10 09.00

Location: F3 (Flodis), Lindstedtsvägen 26

Video link: https://kth-se.zoom.us/j/64433421713

Language: English

Subject area: Computer Science

Doctoral student: Ciwan Ceylan, Robotics, Perception and Learning (RPL)

Opponent: Associate Professor Davide Mottin, Aarhus University

Supervisor: Professor Danica Kragic Jensfelt, Collaborative Autonomous Systems

QC 20251118

Abstract

The ability to efficiently learn embeddings—low-dimensional vector representations of complex data—has been central to recent advances in machine learning. Network data models, which represent entities (nodes) and their relationships (edges), provide a powerful framework for studying diverse systems, from social interactions and infrastructure to molecular biology. Both research and practical applications have benefited greatly from progress in embedding learning, with node embeddings in particular enabling downstream tasks such as node classification, clustering, anomaly detection, graph alignment, and link prediction.

However, not all network types have seen equal progress. In particular, embedding models for transaction networks—formed by digital payments, transfers, and exchanges—remain underdeveloped, despite their significant potential for applications such as financial crime detection. Several methodological challenges persist in learning node embeddings for transaction networks, as key modalities must be captured while also meeting essential model desiderata. This thesis considers three such desiderata: models should be unsupervised, to address the lack of labelled data; analysable, to ensure interpretability in unsupervised settings; and scalable, to handle the size and complexity of real-world transaction networks.

Guided by these goals, the thesis introduces node embedding models designed to capture three essential transaction network modalities: edge flow, edge directionality, and multi-scale structure. In doing so, it provides both methodological advances and analytical insights. Four key findings are that: (i) it is possible to learn node embeddings that represent transaction flow, something not previously demonstrated; (ii) nodes that only receive transactions (so-called "sinks") degrade embedding quality, but this can be mitigated by combining directed and undirected propagation; (iii) standard message-passing methods can lead to rank deficiency, harming embedding quality, which can be resolved through a new technique called message aggregation; and (iv) embeddings can be made interpretable, with each feature corresponding to a meaningful aspect of the network.

A persistent practical challenge in transaction network research—and a major reason for its limited progress—is the scarcity of accessible datasets, owing to the security and privacy concerns surrounding financial data. This thesis circumvents this issue by focusing on the underlying methodological challenges of node embedding modelling for transaction networks. Extensive empirical evaluations are conducted on both proxy datasets, comprising communication and social networks that share the same key modalities as real-world banking data, and on publicly available cryptocurrency and simulated transaction network datasets, which enable broader validation of the proposed models.

urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-373037