Towards Automated Parts Recognition in Manufacturing with Synthetic Data
Time: Tuesday 2025-12-09, 13:00
Location: Kollegiesalen, Brinellvägen 8, Stockholm
Video link: https://kth-se.zoom.us/j/62913476523
Language: English
Doctoral student: Xiaomeng Zhu, Robotics, Perception and Learning (RPL)
Opponent: Professor Tomohiko Sakao, Linköping University, Linköping, Sweden
Supervisor: Professor Atsuto Maki, Robotics, Perception and Learning (RPL); Professor Mårten Björkman, Robotics, Perception and Learning (RPL); Doctor Pär Mårtensson, Scania CV AB, Sweden; Professor Lars Hanson, University of Skövde, Skövde, Sweden
Abstract
This thesis advances the understanding and application of synthetic data for manufacturing parts recognition. Vision-based inspection systems in manufacturing rely heavily on real image data, which are costly to collect, annotate, and adapt across products and environments. To address these challenges, this work presents a systematic investigation of how synthetic data can be generated, evaluated, and applied to achieve robust and scalable recognition performance. The research introduces a series of new industrial benchmark datasets (SIP-17, SIP15-OD, and SIP2A-OD) covering multiple manufacturing use cases and factory environments, enabling unified evaluation of sim-to-real transfer in classification and detection tasks. Building on these datasets, a domain randomization pipeline is developed to systematically explore the effects of rendering parameters, material variability, and illumination on model generalization. To further automate data generation, the thesis proposes Synthetic Active Learning (SAL), a closed-loop framework that identifies model weaknesses and adaptively refines synthetic data generation without requiring real samples or manual tuning. Experiments across the benchmark datasets show that the proposed method improves model robustness compared to existing approaches while reducing manual labeling requirements. Collectively, these contributions provide new insights into how synthetic data can be systematically leveraged to build data-efficient, automated, and reliable vision systems for manufacturing, supporting the future development of intelligent and flexible production systems.