Synthesizing weld pool dynamics via VAE-GAN to enhance human control performance

Research output: Contribution to journal › Article › peer-review

Abstract

A process is not truly robotized unless the robot can adaptively respond to it—yet such adaptability varies across processes and often demands application-specific engineering. Learning from a Human Operator (HO) who views and controls the welding process via a Human-Robot Interface (HRI) provides a more generalizable approach to acquiring the needed adaptation. A fundamental challenge is how the HO we wish to learn from can perform beyond their capability limits. To address this challenge, we propose predicting and rendering future outcomes to the HO, resulting in predictive human control, which is expected to outperform conventional reactive human control. However, the performance enhancement critically depends on the prediction's accuracy and its interpretability by the HO. In welding processes, such prediction entails generating complex, dynamic weld pool images from welding parameters. This task requires not only high-fidelity synthesis to capture key process features but also controlled generation to preserve temporal dependencies on the parameters. The predictive model must replicate the unknown dynamics mapping welding parameters to the visual evolution of the weld pool. We propose a deep generative framework for Gas Tungsten Arc Welding (GTAW) consisting of three interconnected stages. First, a Beta Total Correlation Variational Autoencoder (β-TC VAE) compresses weld pool images into a low-dimensional latent space capturing dominant features. Next, a Long Short-Term Memory (LSTM) network models the temporal evolution of latent codes based on welding parameter variations. Finally, a PatchGAN refines the VAE reconstructions, restoring fine-grained stochastic details to enhance visual realism. By combining latent representation learning, temporal modeling, and adversarial refinement, our framework enables realistic, controlled synthesis of dynamic weld pools. This approach is extendable to other manufacturing processes with strong parameter dependencies.
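The three-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: all sizes (64×64 grayscale frames, a 16-dimensional latent space, 3 welding parameters) are assumptions, and the β-TC VAE's total-correlation penalty and the adversarial training loop are omitted; only the forward data flow through the VAE encoder, the parameter-conditioned latent LSTM, the decoder, and a PatchGAN-style discriminator is shown.

```python
# Sketch (hypothetical sizes) of the VAE -> LSTM -> PatchGAN pipeline:
# a VAE compresses weld pool images to latent codes, an LSTM predicts
# latent evolution from welding parameters, and a patch discriminator
# scores local realism of the synthesized frames.
import torch
import torch.nn as nn

IMG, Z, P = 64, 16, 3  # image side, latent dim, number of welding parameters

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 4, 2, 1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 4, 2, 1), nn.ReLU(),  # 32 -> 16
            nn.Flatten())
        self.mu = nn.Linear(32 * 16 * 16, Z)
        self.logvar = nn.Linear(32 * 16 * 16, Z)

    def forward(self, x):
        h = self.conv(x)
        return self.mu(h), self.logvar(h)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(Z, 32 * 16 * 16)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.ReLU(),    # 16 -> 32
            nn.ConvTranspose2d(16, 1, 4, 2, 1), nn.Sigmoid())  # 32 -> 64

    def forward(self, z):
        return self.deconv(self.fc(z).view(-1, 32, 16, 16))

class LatentLSTM(nn.Module):
    """Predicts the latent trajectory from latent codes plus welding parameters."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(Z + P, 64, batch_first=True)
        self.head = nn.Linear(64, Z)

    def forward(self, z_seq, params_seq):
        out, _ = self.lstm(torch.cat([z_seq, params_seq], dim=-1))
        return self.head(out)

class PatchDiscriminator(nn.Module):
    """Outputs a grid of real/fake scores, one per local image patch."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(16, 1, 4, 2, 1))  # 64 -> 16x16 patch scores

    def forward(self, x):
        return self.net(x)

def reparameterize(mu, logvar):
    # Standard VAE reparameterization trick.
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

# One forward pass through the pipeline on dummy data.
enc, dec, lstm, disc = Encoder(), Decoder(), LatentLSTM(), PatchDiscriminator()
frames = torch.rand(5, 1, IMG, IMG)     # a 5-frame weld pool image sequence
params = torch.rand(1, 5, P)            # welding parameters per frame
mu, logvar = enc(frames)
z = reparameterize(mu, logvar)          # (5, Z) latent codes
z_next = lstm(z.unsqueeze(0), params)   # predicted latent trajectory
pred = dec(z_next.squeeze(0))           # synthesized frames, (5, 1, 64, 64)
scores = disc(pred)                     # per-patch realism scores, (5, 1, 16, 16)
```

In training, the discriminator's patch scores would drive an adversarial loss that refines the decoder's reconstructions, alongside the VAE reconstruction and total-correlation terms.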

Original language: English
Pages (from-to): 98-107
Number of pages: 10
Journal: Journal of Manufacturing Processes
Volume: 155
DOIs
State: Published - Dec 12 2025

Bibliographical note

Publisher Copyright:
© 2025 The Society of Manufacturing Engineers

Funding

This work is partially funded by the National Science Foundation under grants CMMI-2024614 and IIS-2327113, and the Department of Electrical and Computer Engineering, the Institute for Sustainable Manufacturing and Department of Mathematics of the University of Kentucky, Lexington, Kentucky, USA.

Funders (funder number):
- Department of Electrical and Computer Engineering, Western Michigan University
- National Science Foundation Arctic Social Science Program (CMMI-2024614, IIS-2327113)

Keywords

• Deep learning
• GTAW
• Generative AI
• Generative Adversarial Network (GAN)
• Long Short-Term Memory (LSTM)
• Variational Autoencoder (VAE)

ASJC Scopus subject areas

• Strategy and Management
• Management Science and Operations Research
• Industrial and Manufacturing Engineering
