Scalable GPU Performance Variability Analysis Framework
Published in arXiv preprint arXiv:2506.20674, 2025
Recommended citation: Ankur Lahiry, Ayush Pokharel, Seth Ockerman, Amal Gueroudji, Line Pouchard, and Tanzima Z Islam. (2025). "Scalable GPU Performance Variability Analysis Framework." arXiv preprint arXiv:2506.20674. https://arxiv.org/abs/2506.20674
A scalable framework for analyzing GPU logs from large trace datasets. It uses distributed partitioning and parallel processing to reduce analysis time and memory overhead, achieving a 67% improvement in scalability while enabling fast identification of performance variability, memory stalls, and system bottlenecks across repeated HPC runs.
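The partition-then-process idea in the summary can be illustrated with a minimal Python sketch. This is not the paper's implementation: the record layout `(run_id, kernel_name, duration_ms)`, the sample `TRACE` data, and the helpers `partition`, `summarize`, `merge`, and `variability` are all hypothetical, and a local thread pool stands in for whatever distributed workers the framework actually uses. Each shard is reduced to small mergeable statistics, so no worker needs the full trace in memory.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from math import sqrt

# Hypothetical trace records: (run_id, kernel_name, duration_ms).
TRACE = [
    (0, "gemm", 10.0), (0, "copy", 2.0),
    (1, "gemm", 12.0), (1, "copy", 2.0),
    (2, "gemm", 14.0), (2, "copy", 2.0),
]

def partition(records, num_shards):
    """Split records into num_shards round-robin shards."""
    return [records[i::num_shards] for i in range(num_shards)]

def summarize(shard):
    """Reduce one shard to per-kernel [count, sum, sum of squares]."""
    stats = defaultdict(lambda: [0, 0.0, 0.0])
    for _run, kernel, ms in shard:
        entry = stats[kernel]
        entry[0] += 1
        entry[1] += ms
        entry[2] += ms * ms
    return dict(stats)

def merge(partials):
    """Combine per-shard statistics by element-wise addition."""
    total = defaultdict(lambda: [0, 0.0, 0.0])
    for part in partials:
        for kernel, (n, sx, sxx) in part.items():
            entry = total[kernel]
            entry[0] += n
            entry[1] += sx
            entry[2] += sxx
    return dict(total)

def variability(records, num_shards=3):
    """Coefficient of variation (std / mean) of duration per kernel."""
    shards = partition(records, num_shards)
    # Threads are an assumption for portability; a process pool or
    # cluster workers could run summarize() on each shard instead.
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        partials = list(pool.map(summarize, shards))
    result = {}
    for kernel, (n, sx, sxx) in merge(partials).items():
        mean = sx / n
        var = max(sxx / n - mean * mean, 0.0)  # population variance
        result[kernel] = sqrt(var) / mean if mean else 0.0
    return result
```

Calling `variability(TRACE)` returns one coefficient of variation per kernel. In this toy data the `copy` kernel has identical durations across runs, while `gemm` varies, so `gemm` would be the one flagged for closer inspection.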
Authors: Ankur Lahiry, Ayush Pokharel, Seth Ockerman, Amal Gueroudji, Line Pouchard, and Tanzima Z Islam
