CGO 2025

23rd ACM/IEEE International Symposium on Code Generation and Optimization (CGO 2025), March 1–5, 2025, Las Vegas, NV, USA

CGO 2025 – Proceedings


Frontmatter

Title Page
Welcome from the General Chairs
Welcome from the Program Chairs
CGO 2025 Organization
CGO 2025 Sponsors and Supporters

Distinguished Papers

Synthesis of Sorting Kernels
Marcel Ullrich and Sebastian Hack
(Saarland University, Germany)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Tensorize: Fast Synthesis of Tensor Programs from Legacy Code using Symbolic Tracing, Sketching and Solving
Alexander Brauckmann, Luc Jaulmes, José W. de Souza Magalhães, Elizabeth Polgreen, and Michael F. P. O’Boyle
(University of Edinburgh, UK)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Enhancing Deployment-Time Predictive Model Robustness for Code Analysis and Optimization
Huanting Wang, Patrick Lenihan, and Zheng Wang
(University of Leeds, UK)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced

Optimizations and Transformations (1)

SySTeC: A Symmetric Sparse Tensor Compiler
Radha Patel, Willow Ahrens, and Saman Amarasinghe
(Massachusetts Institute of Technology, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Pattern Matching in AI Compilers and Its Formalization
Joseph W. Cutler, Alex Collins, Bin Fan, Mahesh Ravishankar, and Vinod Grover
(University of Pennsylvania, USA; NVIDIA, USA; NVIDIA, UK; AMD, USA)
Publisher's Version
Scalar Interpolation: A Better Balance between Vector and Scalar Execution for SuperScalar Architectures
Reza Ghanbari, Henry Kao, João P. L. De Carvalho, Ehsan Amiri, and J. Nelson Amaral
(University of Alberta, Canada; Huawei Technologies, Canada)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced

ML Tools and Optimization

VEGA: Automatically Generating Compiler Backends using a Pre-trained Transformer Model
Ming Zhong, Fang Lv, Lulin Wang, Lei Qiu, Yingying Wang, Ying Liu, Huimin Cui, Xiaobing Feng, and Jingling Xue
(Institute of Computing Technology at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China; UNSW, Australia)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
IntelliGen: Instruction-Level Auto-tuning for Tensor Program with Monotonic Memory Optimization
Zixuan Ma, Haojie Wang, Jingze Xing, Shuhong Huang, Liyan Zheng, Chen Zhang, Huanqi Cao, Kezhao Huang, Mingshu Zhai, Shizhi Tang, Penghan Wang, and Jidong Zhai
(Tsinghua University, China; Qingcheng.AI, China)
Publisher's Version
GraalNN: Context-Sensitive Static Profiling with Graph Neural Networks
Lazar Milikic, Milan Cugurovic, and Vojin Jovanovic
(Oracle Labs, Switzerland; Oracle Labs, Serbia)
Publisher's Version
LLM-Vectorizer: LLM-Based Verified Loop Vectorizer
Jubi Taneja, Avery Laird, Cong Yan, Madan Musuvathi, and Shuvendu K. Lahiri
(Microsoft Research, USA; University of Toronto, Canada)
Publisher's Version

Architectures and Code Generation

Calibro: Compilation-Assisted Linking-Time Binary Code Outlining for Code Size Reduction in Android Applications
Zhanhao Liang, Hanming Sun, Wenhan Shang, Mengting Yuan, Jingqin Fu, Jiang Ma, Chun Jason Xue, and Qingan Li
(Wuhan University, China; Wuhan Broadcasting and Television Station, China; Guangdong OPPO Mobile Telecommunications, China; MBZUAI, United Arab Emirates)
Publisher's Version
A Multi-level Compiler Backend for Accelerated Micro-kernels Targeting RISC-V ISA Extensions
Alexandre Lopoukhine, Federico Ficarelli, Christos Vasiladiotis, Anton Lydike, Josse Van Delm, Alban Dutilleul, Luca Benini, Marian Verhelst, and Tobias Grosser
(University of Cambridge, UK; University of Bologna, Italy; Cineca, Italy; University of Edinburgh, UK; KU Leuven, Belgium; ENS Rennes, France; ETH Zurich, Switzerland)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
xDSL: Sidekick Compilation for SSA-Based Compilers
Mathieu Fehr, Michel Weber, Christian Ulmann, Alexandre Lopoukhine, Martin Paul Lücke, Théo Degioanni, Christos Vasiladiotis, Michel Steuwer, and Tobias Grosser
(University of Edinburgh, UK; ETH Zurich, Switzerland; University of Cambridge, UK; ENS Rennes, France; Technische Universität Berlin, Germany)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced

ML Compilers

ANT-ACE: An FHE Compiler Framework for Automating Neural Network Inference
Long Li, Jianxin Lai, Peng Yuan, Tianxiang Sui, Yan Liu, Qing Zhu, Xiaojing Zhang, Linjie Xiao, Wenguang Chen, and Jingling Xue
(Ant Group, China; Tsinghua University, China; UNSW, Australia; Ant Group, Australia)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
CUrator: An Efficient LLM Execution Engine with Optimized Integration of CUDA Libraries
Yoon Noh Lee, Yongseung Yu, and Yongjun Park
(Yonsei University, South Korea)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Accelerating LLMs using an Efficient GEMM Library and Target-Aware Optimizations on Real-World PIM Devices
Hyeoncheol Kim, Taehoon Kim, Taehyeong Park, Donghyeon Kim, Yongseung Yu, Hanjun Kim, and Yongjun Park
(Yonsei University, South Korea; Rebellions, South Korea; Hanyang University, South Korea)
Publisher's Version

MLIR

The MLIR Transform Dialect: Your Compiler Is More Powerful Than You Think
Martin Paul Lücke, Oleksandr Zinenko, William S. Moses, Michel Steuwer, and Albert Cohen
(University of Edinburgh, UK; Google DeepMind, France; University of Illinois at Urbana-Champaign, USA; Google DeepMind, USA; Technische Universität Berlin, Germany)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Combining MLIR Dialects with Domain-Specific Architecture for Efficient Regular Expression Matching
Andrea Somaini, Filippo Carloni, Giovanni Agosta, Marco D. Santambrogio, and Davide Conficconi
(Politecnico di Milano, Italy)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
DialEgg: Dialect-Agnostic MLIR Optimizer using Equality Saturation with Egglog
Abd-El-Aziz Zayed and Christophe Dubach
(McGill University, Canada; Mila, Canada)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced

Quantum Computing (1)

Synthesis of Quantum Simulators by Compilation
Meisam Tarabkhah, Mahshid Delavar, Mina Doosti, and Amir Shaikhha
(University of Edinburgh, UK; University of Sheffield, UK)
Publisher's Version
Weaver: A Retargetable Compiler Framework for FPQA Quantum Architectures
Oğuzcan Kırmemiş, Francisco Romão, Emmanouil Giortamis, and Pramod Bhatotia
(TU Munich, Germany)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced

Program Analysis and Synthesis

Automatic Synthesis of Specialized Hash Functions
Renato B. Hoffmann, Leonardo G. Faé, Dalvan Griebler, Xinliang David Li, and Fernando Magno Quintão Pereira
(PUC-RS, Brazil; Google, USA; Federal University of Minas Gerais, Brazil)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Stack Filtering: Elevating Precision and Efficiency in Rust Pointer Analysis
Wei Li, Dongjie He, Wenguang Chen, and Jingling Xue
(UNSW, Australia; Chongqing University, China; Tsinghua University, China)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
SkipFlow: Improving the Precision of Points-to Analysis using Primitive Values and Predicate Edges
David Kozak, Codrut Stancu, Tomáš Vojnar, and Christian Wimmer
(Oracle Labs, Czechia; Brno University of Technology, Czechia; Oracle Labs, Switzerland; Masaryk University, Czechia; Oracle Labs, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced

Safety and Resilience

FastFlip: Compositional SDC Resiliency Analysis
Keyur Joshi, Rahul Singh, Tommaso Bassetto, Sarita Adve, Darko Marinov, and Sasa Misailovic
(University of Illinois at Urbana-Champaign, USA)
Publisher's Version
MTE4JNI: A Memory Tagging Method to Protect Java Heap Memory from Illicit Native Code Access
Huinan Chen, Jiang Ma, Chun Jason Xue, and Qingan Li
(Wuhan University, China; Guangdong OPPO Mobile Telecommunications, China; MBZUAI, United Arab Emirates)
Publisher's Version
Memory Safety Instrumentations in Practice: Usability, Performance, and Security Guarantees
Tina Jung, Fabian Ritter, and Sebastian Hack
(Saarland University, Germany)
Publisher's Version Published Artifact Artifacts Available

Optimizations and Transformations (2)

PreFix: Optimizing the Performance of Heap-Intensive Applications
Chaitanya Mamatha Ananda, Rajiv Gupta, Sriraman Tallam, Han Shen, and Xinliang David Li
(University of California at Riverside, USA; Google, USA)
Publisher's Version
A Priori Loop Nest Normalization: Automatic Loop Scheduling in Complex Applications
Lukas Trümper, Philipp Schaad, Berke Ates, Alexandru Calotoiu, Marcin Copik, and Torsten Hoefler
(Daisytuner, Germany; ETH Zurich, Switzerland)
Publisher's Version
An Efficient Polynomial Multiplication Derived Implementation of Convolution in Neural Networks
Haoke Xu, Yulin Zhang, Zitong Cheng, and Xiaoming Li
(University of Delaware, USA; Minzu University of China, China)
Publisher's Version

Quantum Computing (2)

ASDF: A Compiler for Qwerty, a Basis-Oriented Quantum Programming Language
Austin J. Adams, Sharjeel Khan, Arjun S. Bhamra, Ryan R. Abusaada, Anthony M. Cabrera, Cameron C. Hoechst, Travis S. Humble, Jeffrey S. Young, and Thomas M. Conte
(Georgia Institute of Technology, USA; Oak Ridge National Laboratory, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Qubit Movement-Optimized Program Generation on Zoned Neutral Atom Processors
Enhyeok Jang, Youngmin Kim, Hyungseok Kim, Seungwoo Choi, Yipeng Huang, and Won Woo Ro
(Yonsei University, South Korea; Rutgers University, USA)
Publisher's Version

GPU and Parallelism

Code Generation for Cryptographic Kernels using Multi-word Modular Arithmetic on GPU
Naifeng Zhang and Franz Franchetti
(Carnegie Mellon University, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning
Guoliang He and Eiko Yoneki
(University of Cambridge, UK)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Proteus: Portable Runtime Optimization of GPU Kernel Execution with Just-in-Time Compilation
Giorgis Georgakoudis, Konstantinos Parasyris, and David Beckingsale
(Lawrence Livermore National Laboratory, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced

Security, Fault Tolerance, and Cryptography

Qiwu: Exploiting Ciphertext-Level SIMD Parallelism in Homomorphic Encryption Programs
Zhongcheng Zhang, Ying Liu, Yuyang Zhang, Zhenchuan Chen, Jiacheng Zhao, Xiaobing Feng, Huimin Cui, and Jingling Xue
(Institute of Computing Technology at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China; Zhongguancun Laboratory, China; UNSW, Australia)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Cage: Hardware-Accelerated Safe WebAssembly
Martin Fink, Dimitrios Stavrakakis, Dennis Sprokholt, Soham Chakraborty, Jan-Erik Ekberg, and Pramod Bhatotia
(TU Munich, Germany; TU Delft, Netherlands; Huawei, Finland)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Teapot: Efficiently Uncovering Spectre Gadgets in COTS Binaries
Fangzheng Lin, Zhongfa Wang, and Hiroshi Sasaki
(Institute of Science Tokyo, Japan)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Janitizer: Rethinking Binary Tools for Practical and Comprehensive Security
Mahwish Arif, Sam Ainsworth, and Timothy M. Jones
(University of Cambridge, UK; University of Edinburgh, UK)
Publisher's Version
Parallaft: Runtime-Based CPU Fault Tolerance via Heterogeneous Parallelism
Boyue Zhang, Sam Ainsworth, Lev Mukhanov, and Timothy M. Jones
(University of Cambridge, UK; University of Edinburgh, UK; Queen Mary University of London, UK)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced

Optimizations and Transformations (3)

Postiz: Extending Post-increment Addressing for Loop Optimization and Code Size Reduction
Enming Fan, Xiaofeng Guan, Fan Hu, Heng Shi, Hao Zhou, and Jianguo Yao
(Shanghai Enflame Technology, China; Shanghai Jiao Tong University, China)
Publisher's Version
Towards Efficient Compiler Auto-tuning: Leveraging Synergistic Search Spaces
Haolin Pan, Yuanyu Wei, Mingjie Xing, Yanjun Wu, and Chen Zhao
(Institute of Software at Chinese Academy of Sciences, China; Hangzhou Institute for Advanced Study at University of Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Stardust: Compiling Sparse Tensor Algebra to a Reconfigurable Dataflow Architecture
Olivia Hsu, Alexander Rucker, Tian Zhao, Varun Desai, Kunle Olukotun, and Fredrik Kjolstad
(Stanford University, USA)
Publisher's Version
Vectron: A Dynamic Programming Auto-vectorization Framework
Sourena Naser Moghaddasi, Haris Smajlović, Ariya Shajii, and Ibrahim Numanagić
(University of Victoria, Canada; Exaloop, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional

Runtime and System Tools

Honey Potion: An eBPF Backend for Elixir
Kael Soares Augusto, Vinícius Pacheco, Marcos A. Vieira, Rodrigo Geraldo Ribeiro, and Fernando Magno Quintão Pereira
(Federal University of Minas Gerais, Brazil; Cadence, Brazil; Federal University of Ouro Preto, Brazil)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
GoFree: Reducing Garbage Collection via Compiler-Inserted Freeing
Haoran Peng, Yu Zhang, Michael D. Ernst, Jinbao Chen, and Boyao Ding
(University of Science and Technology of China, China; University of Washington, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Improving Native-Image Startup Performance
Matteo Basso, Aleksandar Prokopec, Andrea Rosà, and Walter Binder
(USI Lugano, Switzerland; Oracle Labs, Switzerland)
Publisher's Version Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Speeding up the Local C++ Development Cycle with Header Substitution
Nader Al Awar, Zijian Yi, George Biros, and Milos Gligoric
(University of Texas at Austin, USA)
Publisher's Version Artifacts Functional Results Reproduced
