ICST 2025
2025 IEEE Conference on Software Testing, Verification and Validation (ICST)

2025 IEEE Conference on Software Testing, Verification and Validation (ICST), March 31 – April 4, 2025, Naples, Italy

ICST 2025 – Preliminary Table of Contents

Frontmatter

Title Page

Message from the General Chairs

Message from the Program Co-Chairs

ICST 2025 Organization

Journal First Papers

ICST 2025 Sponsors and Supporters

Technical-Research Track

SPIDER: Fuzzing for Stateful Performance Issues in the ONOS Software-Defined Network Controller
Ao Li, Rohan Padhye, and Vyas Sekar
(Carnegie Mellon University, USA)

Detecting and Evaluating Order-Dependent Flaky Tests in JavaScript
Negar Hashemi, Amjed Tahir, Shawn Rasheed, August Shi, and Rachel Blagojevic
(Massey University, New Zealand; UCOL, New Zealand; University of Texas at Austin, USA)

The Impact of List Reduction for Language Agnostic Test Case Reducers
Tobias Heineken and Michael Philippsen
(Friedrich-Alexander University Erlangen-Nürnberg, Germany)

Hybrid Equivalence/Non-equivalence Testing
Laboni Sarker and Tevfik Bultan
(University of California at Santa Barbara, USA)

Metamorphic Testing for Pose Estimation Systems
Matias Duran, Thomas Laurent, Ellen Rushe, and Anthony Ventresque
(SFI Lero, Ireland; Trinity College Dublin, Ireland; Dublin City University, Ireland)

Mutation-Based Fuzzing of the Swift Compiler with Incomplete Type Information
Sarah Canto Hyatt and Kyle Dewey
(University of California at Santa Barbara, USA; California State University, USA)

Scalable SMT Sampling for Floating-Point Formulas via Coverage-Guided Fuzzing
Manuel Carrasco, Cristian Cadar, and Alastair F. Donaldson
(Imperial College London, UK)

Turbulence: Systematically and Automatically Testing Instruction-Tuned Large Language Models for Code
Shahin Honarvar, Mark van der Wilk, and Alastair F. Donaldson
(Imperial College London, UK; University of Oxford, UK)

An Empirical Study of Web Flaky Tests: Understanding and Unveiling DOM Event Interaction Challenges
Yu Pei, Jeongju Sohn, and Mike Papadakis
(University of Luxembourg, Luxembourg; Kyungpook National University, South Korea)

Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities
Avishree Khare, Saikat Dutta, Ziyang Li, Alaia Solko-Breslin, Rajeev Alur, and Mayur Naik
(University of Pennsylvania, USA; Cornell University, USA)

ADGE: Automated Directed GUI Explorer for Android Applications
Yue Jiang, Xiaobo Xiang, Qingli Guo, Qi Gong, and Xiaorui Gong
(University of Chinese Academy of Sciences, China; Institute of Information Engineering at Chinese Academy of Sciences, Beijing, China; Singular Security Lab, Beijing, China)

Multi-project Just-in-Time Software Defect Prediction Based on Multi-task Learning for Mobile Applications
Feng Chen, Yuxin Ke, Xin Liu, and Qingjie Wei
(Chongqing University of Posts and Telecommunications, China)

A Taxonomy of Integration-Relevant Faults for Microservice Testing
Lena Gregor, Anja Hentschel, Leon Kastner, and Alexander Pretschner
(TU Munich, Germany; Siemens, Germany)

Benchmarking Image Perturbations for Testing Automated Driving Assistance Systems
Stefano Carlo Lambertenghi, Hannes Leonhard, and Andrea Stocco
(TU Munich, Germany; fortiss, Germany)

Improving the Readability of Automatically Generated Tests using Large Language Models
Matteo Biagiola, Gianluca Ghislotti, and Paolo Tonella
(USI Lugano, Switzerland)

Benchmarking Generative AI Models for Deep Learning Test Input Generation
Maryam Maryam, Matteo Biagiola, Andrea Stocco, and Vincenzo Riccio
(University of Udine, Italy; USI Lugano, Switzerland; TU Munich, Germany; fortiss, Germany)

Challenges, Strategies, and Impacts: A Qualitative Study on UI Testing in CI/CD Processes from GitHub Developers’ Perspectives
Xiaoxiao Gan, Huayu Liang, and Chris Brown
(Virginia Tech, USA)

Coverage Metrics for T-Wise Feature Interactions
Sabrina Böhm, Tim Jannik Schmidt, Sebastian Krieter, Tobias Pett, Thomas Thüm, and Malte Lochau
(University of Ulm, Germany; University of Paderborn, Germany; TU Braunschweig, Germany; Karlsruhe Institute for Technology, Germany; University of Siegen, Germany)

Code, Test, and Coverage Evolution in Mature Software Systems: Changes over the Past Decade
Thomas Bailey and Cristian Cadar
(Imperial College London, UK)

Test Wars: A Comparative Study of SBST, Symbolic Execution, and LLM-Based Approaches to Unit Test Generation
Azat Abdullin, Pouria Derakhshanfar, and Annibale Panichella
(JetBrains Research, Netherlands; Delft University of Technology, Netherlands)

Suspicious Types and Bad Neighborhoods: Filtering Spectra with Compiler Information
Leonhard Applis, Matthías Páll Gissursarson, and Annibale Panichella
(National University of Singapore, Singapore; Chalmers University of Technology, Sweden; Delft University of Technology, Netherlands)

Many-Objective Neuroevolution for Testing Games
Patric Feldmeier, Katrin Schmelz, and Gordon Fraser
(University of Passau, Germany)

Differential Testing of Concurrent Classes
Valerio Terragni and Shing-Chi Cheung
(University of Auckland, New Zealand; Hong Kong University of Science and Technology, China)

On Accelerating Deep Neural Network Mutation Analysis by Neuron and Mutant Clustering
Lauren Lyons and Ali Ghanbari
(Auburn University, USA)

AugmenTest: Enhancing Tests with LLM-Driven Oracles
Shaker Mahmud Khandaker, Fitsum Kifetew, Davide Prandi, and Angelo Susi
(Fondazione Bruno Kessler, Italy)

Testing Practices, Challenges, and Developer Perspectives in Open-Source IoT Platforms
Daniel Rodriguez-Cardenas, Safwat Ali Khan, Prianka Mandal, Adwait Nadkarni, Kevin Moran, and Denys Poshyvanyk
(William & Mary, USA; George Mason University, USA; University of Central Florida, USA)

Impact of Large Language Models of Code on Fault Localization
Suhwan Ji, Sanghwa Lee, Changsup Lee, Yo-Sub Han, and Hyeonseung Im
(Yonsei University, South Korea; Kangwon National University, South Korea)

Benchmarking Open-Source Large Language Models for Log Level Suggestion
Yi Wen Heng, Zeyang Ma, Zhenhao Li, Dong Jae Kim, and Tse-Hsun (Peter) Chen
(Concordia University, Canada; York University, Canada; DePaul University, USA)

Understanding and Enhancing Attribute Prioritization in Fixing Web UI Tests with LLMs
Zhuolin Xu, Qiushi Li, and Shin Hwei Tan
(Concordia University, Canada)

RustyRTS: Regression Test Selection for Rust
Simon Hundsdorfer, Roland Würsching, and Alexander Pretschner
(TU Munich, Germany)

An Analysis of LLM Fine-Tuning and Few-Shot Learning for Flaky Test Detection and Classification
Riddhi More and Jeremy S. Bradbury
(Ontario Tech University, Canada)

On the Energy Consumption of Test Generation
Fitsum Kifetew, Davide Prandi, and Angelo Susi
(Fondazione Bruno Kessler, Italy)

Industry Track

Practical Pipeline-Aware Regression Test Optimization for Continuous Integration
Daniel Schwendner, Maximilian Jungwirth, Martin Gruber, Martin Knoche, Daniel Merget, and Gordon Fraser
(BMW Group, Germany; University of Passau, Germany)

Introducing Black-Box Fuzz Testing for REST APIs in Industry: Challenges and Solutions
Andrea Arcuri, Alexander Poth, and Olsi Rrjolli
(Kristiania University College, Norway; Oslo Metropolitan University, Norway; Volkswagen, Germany)

Integrating LLM-Based Text Generation with Dynamic Context Retrieval for GUI Testing
Juyeon Yoon, Seah Kim, Somin Kim, Sukchul Jung, and Shin Yoo
(KAIST, South Korea; Samsung Research, South Korea)

Assessing the Uncertainty and Robustness of the Laptop Refurbishing Software
Chengjie Lu, Jiahui Wu, Shaukat Ali, and Mikkel Labori Olsen
(Simula Research Laboratory, Norway; University of Oslo, Norway; Danish Technological Institute, Denmark)

Fault Localization via Fine-tuning Large Language Models with Mutation Generated Stack Traces
Neetha Jambigi, Bartosz Bogacz, Moritz Mueller, Thomas Bach, and Michael Felderer
(University of Cologne, Germany; SAP, Germany; DLR, Germany)

LLMs in the Heart of Differential Testing: A Case Study on a Medical Rule Engine
Erblin Isaku, Christoph Laaber, Hassan Sartaj, Shaukat Ali, Thomas Schwitalla, and Jan F. Nygård
(Simula Research Laboratory, Norway; University of Oslo, Norway; Cancer Registry of Norway, Norway; UiT The Arctic University of Norway, Norway)

Compiler Fuzzing in Continuous Integration: A Case Study on Dafny
Karnbongkot Boonriong, Stefan Zetzsche, and Alastair F. Donaldson
(Imperial College London, UK; Amazon, UK)

LLM-Based Labelling of Recorded Automated GUI-Based Test Cases
Diogo Buarque Franzosi, Emil Alégroth, and Maycel Isaac
(Blekinge Institute of Technology, Sweden; Synteda, Sweden)

Taming Uncertainty in Critical Scenario Generation for Testing Automated Driving Systems
Selma Grosse, Adam Molin, Dejan Ničković, Alessio Gambi, and Cristinel Mateis
(DENSO Automotive, Germany; Austrian Institute of Technology, Austria)

ML-Based Test Case Prioritization: A Research and Production Perspective in CI Environments
Md Asif Khan, Akramul Azim, Ramiro Liscano, Kevin Smith, Yee-Kang Chang, Gkerta Seferi, and Qasim Tauseef
(Ontario Tech University, Canada; IBM, UK; IBM, Canada)

Evaluation of the Choice of LLM in a Multi-agent Solution for GUI-Test Generation
Stevan Tomic, Emil Alégroth, and Maycel Isaac
(Blekinge Institute of Technology, Sweden; Synteda, Sweden)

Early V&V in Knowledge-Centric Systems Engineering: Advances and Benefits in Practice
Jose Luis de la Vara, Juan Manuel Morote, Clara Ayora, Giovanni Giachetti, Luis Alonso, Roy Mendieta, David Muñoz, Ricardo Ruiz Nolasco, and Antonio González
(University of Castilla-La Mancha, Spain; Independent Researcher, Spain; Universidad de Castilla la Mancha, Spain; Universitat Politecnica de Valencia, Spain; The REUSE Company, Spain; RGB Medical Devices, Spain)

Speculative Testing at Google with Transition Prediction
Avi Kondareddy, Sushmita Azad, Abhayendra Singh, and Tim A. D. Henderson
(Google, USA; Google, UK)

Evaluating Machine Learning-Based Test Case Prioritization in the Real World: An Experiment with SAP HANA
Jeongki Son, Gabin An, Jingun Hong, and Shin Yoo
(SAP Labs, South Korea; KAIST, South Korea)

FuzzE, Development of a Fuzzing Approach for Odoo’s Tours Integration Testing Plateform
Gabriel Benoit, François Georis, Géry Debongnie, Benoît Vanderose, and Xavier Devroey
(University of Namur, Belgium; Odoo, Belgium)

Accessible Smart Contracts Verification: Synthesizing Formal Models with Tamed LLMs
Jan Corazza, Ivan Gavran, Gabriela Moreira, and Daniel Neider
(TU Dortmund, Germany; Informal Systems, Austria; Informal Systems, Brazil)

A Tale from the Trenches: Applying Metamorphic and Differential Testing to Bioinformatics Software
Alexis Marsh, Myra B. Cohen, and Robert Cottingham
(Iowa State University, USA; Oak Ridge National Laboratory, USA)

CubeTesterAI: Automated JUnit Test Generation using the LLaMA Model
Daniele Gorla, Shivam Kumar, Pietro Nicolaus Roselli Lorenzini, and Alireza Alipourfaz
(Sapienza University of Rome, Italy; PCCube, Italy)

Short Papers, Vision, and Emerging Results

Leveraging Large Language Models for Explicit Wait Management in End-to-End Web Testing
Dario Olianas, Maurizio Leotta, and Filippo Ricca
(University of Genoa, Italy)

Weighted Call Frequency-Based Fault Localization
Attila Szatmári, Aondowase James Orban, and Tamás Gergely
(University of Szeged, Hungary)

Addressing Data Leakage in HumanEval using Combinatorial Test Design
Jeremy S. Bradbury and Riddhi More
(Ontario Tech University, Canada)

Towards Cross-Build Differential Testing
Jens Dietrich, Tim White, Valerio Terragni, and Behnaz Hassanshahi
(Victoria University of Wellington, New Zealand; University of Auckland, New Zealand; Oracle Labs, Australia)

Test Generation from Use Case Specifications for IoT Systems: Custom, LLM-Based, and Hybrid Approaches
Zacharie Chenail-Larcher, Jean Baptiste Minani, and Naouel Moha
(ÉTS Montréal, Canada; Concordia University, Canada)

Batch Execution of Microbenchmarks for Efficient Performance Testing
Mostafa Jangali, Kundi Yao, Yiming Tang, Diego Elias Costa, and Weiyi Shang
(University of Waterloo, Canada; Rochester Institute of Technology, USA; Concordia University, Canada)

Pre-trained Models for Bytecode Instructions
Donggyu Kim, Taemin Kim, Ji-ho Shin, Song Wang, Heeyoul Choi, and Jaechang Nam
(Handong Global University, South Korea; York University, Canada)

Towards Refined Code Coverage: A New Predictive Problem in Software Testing
Carolin Brandt and Aurora Ramírez
(Delft University of Technology, Netherlands; University of Córdoba, Spain)

EnCus: Customizing Search Space for Automated Program Repair
Seongbin Kim, Sechang Jang, Jindae Kim, and Jaechang Nam
(Handong Global University, South Korea; Seoul National University of Science and Technology, South Korea)

Harnessing Test Call Structures for Improved Fault Localization Effectiveness
Attila Szatmári
(University of Szeged, Hungary)

Improving the Comprehensibility of Generated Test Suites using Test Case Clustering
Mitchell Olsthoorn
(Delft University of Technology, Netherlands)

Education Track

Black-Box Testing for Practitioners: A Case of the New ISTQB Test Analyst Syllabus
Matthias Hamburg and Adam Roman
(International Software Testing Qualifications Board, Belgium; ISTQB German Testing Board, Germany; Jagiellonian University, Poland; ISTQB Polish Testing Board, Poland)

Combining Logic and Large Language Models for Assisted Debugging and Repair of ASP Programs
Ricardo Brancas, Vasco Manquinho, and Ruben Martins
(INESC-ID, Portugal; Universidade de Lisboa, Portugal; Carnegie Mellon University, USA)

Teaching Bug Advocacy through Flipped Classroom
Andreea Galbin-Nasui and Andreea Vescan
(Babes-Bolyai University, Cluj-Napoca, Romania)

Experience Report on using Experiential Learning to Facilitate Learning of Bug Investigation Steps
Adina Moldovan, Oana Casapu, and Andreea Vescan
(Altom, Romania; Babes-Bolyai University, Cluj-Napoca, Romania)

Requirements for an Automated Assessment Tool for Learning Programming by Doing
Arthur Rump, Vadim Zaytsev, and Angelika Mader
(University of Twente, Netherlands)

A System-Level Testing Framework for Automated Assessment of Programming Assignments Allowing Students Object-Oriented Design Freedom
Valerio Terragni and Nasser Giacaman
(University of Auckland, New Zealand)

Can Test Generation and Program Repair Inform Automated Assessment of Programming Projects?
Ruizhen Gu, José Miguel Rojas, and Donghwan Shin
(University of Sheffield, UK)

A Tool-Assisted Training Approach for Empowering Localization and Internationalization Testing Proficiency
Maria Couto, Breno Miranda, and Kiev Gama
(Federal University of Pernambuco, Brazil)

Posters

Poster: Empirical Evaluation of SC-MCC Meta Program Efficiency using Dynamic Symbolic Execution Engine
Monika Rani Golla and Sangharatna Godboley
(NIT Warangal, India)

Poster: Reporting Unique-Cause MC/DC Score using Formal Verification
Monika Rani Golla, Sangharatna Godboley, Avijit Das, and P. Krishna Radha
(NIT Warangal, India; LRDE DRDO, India)

Poster: Quantification of Feature-Interaction Masking in JHipster
Tim Jannik Schmidt, Sabrina Böhm, Sebastian Krieter, Thomas Thüm, and Mathieu Acher
(University of Ulm, Germany; University of Paderborn, Germany; TU Braunschweig, Germany; Univ Rennes - CNRS - Inria - IRISA - IUF, France)

Poster: Unit Testing Past vs. Present: Examining LLMs' Impact on Defect Detection and Efficiency
Rudolf Ramler, Philipp Straubinger, Reinhold Plösch, and Dietmar Winkler
(Software Competence Center Hagenberg, Austria; University of Passau, Germany; JKU Linz, Austria; Center for Digital Production, Austria; TU Wien, Austria)

Testing Tools and Data Showcase Track

Rocket: A System-Level Fuzz-Testing Framework for the XRPL Consensus Algorithm
Wishaal Kanhai, Ivar van Loon, Yuraj Mangalgi, Thijs van der Valk, Lucas Witte, Annibale Panichella, Mitchell Olsthoorn, and Burcu Kulahcioglu Ozkan
(Delft University of Technology, Netherlands)

Codehacks: A Dataset of Adversarial Tests for Competitive Programming Problems Obtained from Codeforces
Max Hort and Leon Moonen
(Simula Research Laboratory, Norway; BI Norwegian Business School, Norway)

E2E-Loader: A Tool to Generate Performance Tests from End-to-End GUI-Level Tests
Sergio Di Meglio, Luigi Libero Lucio Starace, and Sergio Di Martino
(Federico II University of Naples, Italy)

ViMoTest: A Tool to Specify ViewModel-Based GUI Test Scenarios using Projectional Editing
Mario Fuksa, Sandro Speth, and Steffen Becker
(University of Stuttgart, Germany)

RESTgym: A Flexible Infrastructure for Empirical Assessment of Automated REST API Testing Tools
Davide Corradini, Michele Pasqua, and Mariano Ceccato
(University of Luxembourg, Luxembourg; University of Verona, Italy)

AMBER: AI-Enabled Java Microbenchmark Harness
Antonio Trovato, Luca Traini, Federico Di Menna, and Dario Di Nucci
(University of Salerno, Italy; University of L'Aquila, Italy)

Technical Briefings and Tutorials

Scenario-Based Testing with BeamNG.tech (Hands-On Training)
Chrysanthi Papamichail, David Stark, and Alessio Gambi
(BeamNG, Greece; BeamNG, UK; Austrian Institute of Technology, Austria)

A Developer’s Guide to Building and Testing Accessible Mobile Apps
Juan Pablo Sandoval Alcocer, Leonel Merino, Alison Fernandez-Blanco, William Ravelo-Mendez, Camilo Escobar-Velásquez, and Mario Linares-Vásquez
(Pontificia Universidad Católica de Chile, Chile; Universidad de Los Andes, Colombia)

Doctoral Research

Autonomous Systems

Adversarial Testing with Reinforcement Learning
Andrea Doreste
(USI Lugano, Switzerland)

A Method for Systematically Assessing the Safety of Automated Driving Systems via Simulation
Ali Güllü
(University of Tartu, Estonia)

Uncertainty-Aware Autonomous Driving System Testing with Large Language Models
Jiahui Wu
(Simula Research Laboratory, Norway; University of Oslo, Norway)

Web/Mobile Systems

Identifying and Mitigating Flaky Tests in JavaScript
Negar Hashemi
(Massey University, New Zealand)

End-to-End Testing in Web Environments: Addressing Practical Challenges
Sergio Di Meglio
(Federico II University of Naples, Italy)

Advancing Mobile UI Testing by Learning Screen Usage Semantics
Safwat Ali Khan
(George Mason University, USA)

Debugging and Reliability

Enhancing Spectrum-Based Fault Localization in the Context of Reactive Programming
Aondowase James Orban
(University of Szeged, Hungary)

Toward Tool-Agnostic Guidelines for Expert Debugging Strategies
Homayoun Safarpour
(University of Szeged, Hungary)

On Service-to-Service Integration Testing in Microservice Systems
Lena Gregor
(TU Munich, Germany)

Evaluating Correct-Consistency and Robustness in Code-Generating LLMs
Shahin Honarvar
(Imperial College London, UK)

Tool Competition – Self-Driving Car Testing Track

ICST Tool Competition 2025 – Self-Driving Car Testing Track
Christian Birchler, Stefan Klikovits, Mattia Fazzini, and Sebastiano Panichella
(University of Bern, Switzerland; Zurich University of Applied Sciences, Switzerland; Johannes Kepler University Linz, Austria; University of Minnesota, USA)

DETOUR at the ICST 2025 Tool Competition – Self-Driving Car Testing Track
Paolo Arcaini and Ahmet Cetinkaya
(National Institute of Informatics, Japan; Shibaura Institute of Technology, Japan)

DRVN at the ICST 2025 Tool Competition – Self-Driving Car Testing Track
Antony Bartlett, Cynthia Liem, and Annibale Panichella
(Delft University of Technology, Netherlands)

ITS4SDC at the ICST 2025 Tool Competition – Self-Driving Car Testing Track
Ali Güllü, Faiz Ali Shah, and Dietmar Pfahl
(University of Tartu, Estonia)

CertiFail at the ICST 2025 Tool Competition – Self-Driving Car Testing Track
Fasih Munir Malik and Sajad Mazraeh Khatiri
(University of Bern, Switzerland)

NN-SDCTest at the ICST 2025 Tool Competition – Self-Driving Car Testing Track
Prakash Aryana and Sajad Khatiri
(Birla Institute of Technology and Science, India; University of Bern, Switzerland; USI Lugano, Switzerland; Zurich University of Applied Sciences, Switzerland)

Tool Competition – Unmanned Aerial Vehicles Testing Track

ICST Tool Competition 2025 – UAV Testing Track
Sajad Khatiri, Tahereh Zohdinasab, Prasun Saurabh, Dmytro Humeniuk, and Sebastiano Panichella
(University of Bern, Switzerland; Zurich University of Applied Sciences, Switzerland; USI Lugano, Switzerland; Polytechnique Montréal, Canada)

Evolv-1 at the ICST 2025 Tool Competition – UAV Testing Track
Pietro Lechthaler, Davide Prandi, Fitsum Kifetew, and Angelo Susi
(Fondazione Bruno Kessler, Italy)

TGen-UQ at the ICST 2025 Tool Competition – UAV Testing Track
Ali Javadi and Christian Birchler
(University of Bern, Switzerland; Zurich University of Applied Sciences, Switzerland)

PALM at the ICST 2025 Tool Competition – UAV Testing Track
Shuncheng Tang, Zhenya Zhang, Ahmet Cetinkaya, and Paolo Arcaini
(University of Science and Technology of China, China; Kyushu University, Japan; Shibaura Institute of Technology, Japan; National Institute of Informatics, Japan)
