Jaewoong Sim

Students

I am fortunate to work with talented and hardworking students. :D

MS/PhD

● Junseo Lee (Spring 2022 -); Past: B.S. in ECE, Seoul National University
● Kwanseok Choi (Spring 2022 -); Past: B.S. in ECE, Seoul National University
● Wonbeom Lee (Spring 2023 -); Past: B.S. in ECE, Seoul National University
● Jungi Lee (Spring 2023 -); Past: B.S. in ECE, Seoul National University
● Seokwon Lee (Spring 2023 -); Past: B.S. in ECE, Seoul National University
● Jaehoon Cho (Spring 2023 -); Past: B.S. in ECE, Seoul National University
● Soohyun Cha (Spring 2024 -); Past: B.S. in ECE, Seoul National University
● Junyong Park (Spring 2024 -); Past: B.S. in ECE, Seoul National University
● Sangyun Jeon (Spring 2024 -); Past: B.S. in ECE, Seoul National University

Alumni

● Joonho Whangbo (2021-2022; Undergraduate Researcher); Now: PhD at UC Berkeley

Publications

HPCA'26 GRTX: Efficient Ray Tracing for 3D Gaussian-Based Rendering
Junseo Lee, Sangyun Jeon, Jungi Lee, Junyong Park, Jaewoong Sim
Proc. of the 32nd International Symposium on High Performance Computer Architecture (HPCA), Sydney, Australia, Jan 2026

MICRO-58 MX+: Pushing the Limits of Microscaling Formats for Efficient Large Language Model Serving
Jungi Lee, Junyong Park, Soohyun Cha, Jaehoon Cho, Jaewoong Sim
Proc. of the 58th International Symposium on Microarchitecture (MICRO), Seoul, Korea, Oct 2025

HPCA'25 VR-Pipe: Streamlining Hardware Graphics Pipeline for Volume Rendering
Junseo Lee, Jaisung Kim, Junyong Park, Jaewoong Sim
Proc. of the 31st International Symposium on High Performance Computer Architecture (HPCA), Las Vegas, NV, March 2025

OSDI'24 InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management
Wonbeom Lee†, Jungi Lee†, Junghwan Seo, Jaewoong Sim
Proc. of the 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Santa Clara, CA, July 2024
[code] [talk]

ISCA-51 Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization
Jungi Lee†, Wonbeom Lee†, Jaewoong Sim
Proc. of the 51st International Symposium on Computer Architecture (ISCA), Buenos Aires, Argentina, June 2024
[code] [talk]

DAC-61 MoNDE: Mixture of Near-Data Experts for Large-Scale Sparse Models
Taehyun Kim, Kwanseok Choi, Youngmock Cho, Jaehoon Cho, Hyuk-Jae Lee, Jaewoong Sim
Proc. of the 61st Design Automation Conference (DAC), San Francisco, CA, June 2024

TODAES CuPBoP: Making CUDA a Portable Language
Ruobing Han, Jun Chen, Bhanu Garg, Xule Zhou, John Lu, Jeffrey Young, Jaewoong Sim, Hyesoon Kim
ACM Transactions on Design Automation of Electronic Systems (TODAES), June 2024

ASPLOS'24 GSCore: Efficient Radiance Field Rendering via Architectural Support for 3D Gaussian Splatting
Junseo Lee, Seokwon Lee, Jungi Lee, Junyong Park, Jaewoong Sim
Proc. of the 2024 International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), San Diego, CA, April 2024
[Lightning Talk]

PACT-32 SDM: Sharing-Enabled Disaggregated Memory System with Cache Coherent Compute Express Link
Hyokeun Lee, Kwanseok Choi, Hyuk-Jae Lee, Jaewoong Sim
Proc. of the 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT), Vienna, Austria, Oct 2023
[Slides]

ISCA-50 NeuRex: A Case for Neural Rendering Acceleration
Junseo Lee, Kwanseok Choi, Jungi Lee, Seokwon Lee, Joonho Whangbo, Jaewoong Sim
Proc. of the 50th International Symposium on Computer Architecture (ISCA), Orlando, FL, June 2023
[Lightning Talk]

PPoPP'23 CuPBoP: A Framework to Make CUDA Portable
Ruobing Han, Jun Chen, Bhanu Garg, Jeffrey Young, Jaewoong Sim, Hyesoon Kim
Proc. of the 28th International Symposium on Principles and Practice of Parallel Programming (PPoPP) - Poster, Montreal, Canada, Feb 2023

TACO COX: Exposing CUDA Warp-Level Functions to CPUs
Ruobing Han, Jaewon Lee, Jaewoong Sim, Hyesoon Kim
ACM Transactions on Architecture and Code Optimization (TACO), September 2022

TRETS Specializing FGPU for Persistent Deep Learning
Rui Ma, Jia-Ching Hsu, Tian Tan, Eriko Nurvitadhi, David Sheffield, Rob Pelt, Martin Langhammer, Jaewoong Sim, Aravind Dasu and Derek Chiou
ACM Transactions on Reconfigurable Technology and Systems (TRETS), June 2021

ASPLOS'20 Batch-Aware Unified Memory Management in GPUs for Irregular Workloads
Hyojong Kim, Jaewoong Sim, Prasun Gera, Ramyad Hadidi, Hyesoon Kim
Proc. of the 2020 International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Lausanne, Switzerland, March 2020

JPDC Thermal-Aware Processing-in-Memory Instruction Offloading
Lifeng Nai, Ramyad Hadidi, He Xiao, Hyojong Kim, Jaewoong Sim, Hyesoon Kim
Journal of Parallel and Distributed Computing (JPDC), 2019

FCCM'19 Why Compete When You Can Work Together: FPGA-ASIC Integration for Persistent RNNs
Eriko Nurvitadhi, Dongup Kwon, Ali Jafari, Andrew Boutros, Jaewoong Sim, Phillip Tomson, Huseyin Sumbul, Gregory Chen, Phil Knag, Raghavan Kumar, Ram Krishnamurthy and Debbie Marr
Proc. of the 27th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), San Diego, California, Apr 2019

IPDPS'18 CoolPIM: Thermal-Aware Source Throttling for Efficient PIM Instruction Offloading
Lifeng Nai, Ramyad Hadidi, He Xiao, Hyojong Kim, Jaewoong Sim, Hyesoon Kim
Proc. of the 32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vancouver, BC, Canada, May 2018

FPGA'18 A Customizable Matrix Multiplication Framework for the Intel HARPv2 Xeon+FPGA Platform: A Deep Learning Case Study
Duncan Moss, Srivatsan Krishnan, Eriko Nurvitadhi, Piotr Ratuszniak, Chris Johnson, Jaewoong Sim, Asit Mishra, Debbie Marr, Suchit Subhaschandra, Philip H.W. Leong
Proc. of the 26th ACM International Symposium on Field-Programmable Gate Arrays (FPGA), Monterey, California, Feb 2018

FPL-27 High Performance Binary Neural Networks on the Xeon+FPGA Platform
Duncan Moss, Eriko Nurvitadhi, Jaewoong Sim, Asit Mishra, Debbie Marr, Suchit Subhaschandra, Philip H.W. Leong
Proc. of the 27th International Conference on Field-Programmable Logic and Applications (FPL), Ghent, Belgium, Sep 2017

FPGA'17 Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks?
Eriko Nurvitadhi, Ganesh Venkatesh, Jaewoong Sim, Debbie Marr, Randy Huang, Jason Gee Hock Ong, Yeong Tat Liew, Srivatsan Krishnan, Duncan Moss, Suchit Subhaschandra, Guy Boudoukh
Proc. of the 25th ACM International Symposium on Field-Programmable Gate Arrays (FPGA), Monterey, California, Feb 2017 (Covered in the news by the Next Platform)
[PDF]

HPCA-23 GraphPIM: Enabling Instruction-Level PIM Offloading in Graph Computing Frameworks
Lifeng Nai, Ramyad Hadidi, Jaewoong Sim, Hyojong Kim, Pranith Kumar, Hyesoon Kim
Proc. of the 23rd International Symposium on High Performance Computer Architecture (HPCA), Austin, TX, Feb 2017
[PDF]

FPT'16 Accelerating Binarized Neural Networks: Comparison of FPGA, CPU, GPU, and ASIC
Eriko Nurvitadhi, David Sheffield, Jaewoong Sim, Asit Mishra, Ganesh Venkatesh, Debbie Marr
Proc. of the 2016 International Conference on Field-Programmable Technology (FPT), Xi'an, China, Dec 2016
[PDF]

FPL-26 Accelerating Recurrent Neural Networks in Analytics Servers: Comparison of FPGA, CPU, GPU, and ASIC
Eriko Nurvitadhi, Jaewoong Sim, David Sheffield, Asit Mishra, Srivatsan Krishnan, Debbie Marr
Proc. of the 26th International Conference on Field-Programmable Logic and Applications (FPL), Lausanne, Switzerland, Aug 2016
[PDF]

PACT-24 BSSync: Processing Near Memory for Machine Learning Workloads with Bounded Staleness Consistency Models
Joo Hwan Lee, Jaewoong Sim, Hyesoon Kim
Proc. of the 24th International Conference on Parallel Architectures and Compilation Techniques (PACT), San Francisco, CA, Oct 2015 (Best Paper Award)
[PDF] [BibTex]

@inproceedings{lee-pact2015,
  author    = {Joo Hwan Lee and Jaewoong Sim and Hyesoon Kim},
  title     = {BSSync: Processing Near Memory for Machine Learning Workloads with Bounded Staleness Consistency Models},
  booktitle = {Proceedings of the 24th International Conference on Parallel Architectures and Compilation Techniques},
  series    = {PACT-24},
  year      = {2015},
}

MICRO-47 Transparent Hardware Management of Stacked DRAM as Part of Memory
Jaewoong Sim, Alaa R. Alameldeen, Zeshan Chishti, Chris Wilkerson, Hyesoon Kim
Proc. of the 47th International Symposium on Microarchitecture (MICRO), Cambridge, UK, Dec 2014
[PDF] [Slides] [Poster] [BibTex]

@inproceedings{sim-micro2014,
  author    = {Jaewoong Sim and Alaa R. Alameldeen and Zeshan Chishti and Chris Wilkerson and Hyesoon Kim},
  title     = {Transparent Hardware Management of Stacked DRAM as Part of Memory},
  booktitle = {Proceedings of the 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture},
  series    = {MICRO-47},
  year      = {2014},
}

TOPPICKS'14 A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches
Jaewoong Sim, Gabriel H. Loh, Vilas Sridharan, Mike O'Connor
IEEE Micro, Special Issue: Micro's Top Picks from 2013 Computer Architecture Conferences (MICRO TOP PICKS), May/June 2014
[Paper] [BibTex]

@article{sim-toppicks2014,
  author  = {Jaewoong Sim and Gabriel H. Loh and Vilas Sridharan and Mike O'Connor},
  title   = {A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches},
  journal = {Micro, IEEE},
  year    = {2014},
  month   = {May/June},
  volume  = {34},
  number  = {3},
  pages   = {80--90},
}

ISCA-40 Resilient Die-stacked DRAM Caches
Jaewoong Sim, Gabriel H. Loh, Vilas Sridharan, Mike O'Connor
Proc. of the 40th International Symposium on Computer Architecture (ISCA), Tel-Aviv, Israel, June 2013
One of the 12 computer architecture papers of 2013 selected as Top Picks by IEEE MICRO
[PDF] [Slides] [BibTex]

@inproceedings{sim-isca2013,
 author = {Jaewoong Sim and Gabriel H. Loh and Vilas Sridharan and Mike O'Connor},
 title = {Resilient Die-stacked DRAM Caches},
 booktitle = {Proceedings of the 40th Annual International Symposium on Computer Architecture},
 series = {ISCA-40},
 year = {2013},
 location = {Tel-Aviv, Israel},
 pages = {416--427},
 publisher = {ACM},
 address = {New York, NY, USA},
}

MICRO-45 A Mostly-Clean DRAM Cache for Effective Hit Speculation and Self-Balancing Dispatch
Jaewoong Sim, Gabriel H. Loh, Hyesoon Kim, Mike O'Connor, Mithuna Thottethodi
Proc. of the 45th International Symposium on Microarchitecture (MICRO), Vancouver, BC, Canada, Dec 2012
[PDF] [Slides] [Poster] [BibTex]

@inproceedings{sim-micro2012,
  author = {Jaewoong Sim and Gabriel H. Loh and Hyesoon Kim and Mike O'Connor and Mithuna Thottethodi},
  title = {A Mostly-Clean DRAM Cache for Effective Hit Speculation and Self-Balancing Dispatch},
  booktitle = {Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture},
  series = {MICRO-45},
  year = {2012},
  location = {Vancouver, B.C., CANADA},
  pages = {247--257},
  numpages = {11},
  publisher = {IEEE Computer Society},
  address = {Washington, DC, USA},
}

ISCA-39 FLEXclusion: Balancing Cache Capacity and On-Chip Bandwidth via Flexible Exclusion
Jaewoong Sim, Jaekyu Lee, Moinuddin K. Qureshi, Hyesoon Kim
Proc. of the 39th International Symposium on Computer Architecture (ISCA), Portland, OR, June 2012
[PDF] [Slides] [BibTex]

@inproceedings{sim-isca2012,
  author = {Jaewoong Sim and Jaekyu Lee and Moinuddin K. Qureshi and Hyesoon Kim},
  title = {FLEXclusion: Balancing Cache Capacity and On-chip Bandwidth via Flexible Exclusion},
  booktitle = {Proceedings of the 39th Annual International Symposium on Computer Architecture},
  series = {ISCA-39},
  year = {2012},
  location = {Portland, Oregon},
  pages = {321--332},
  publisher = {IEEE Computer Society},
  address = {Washington, DC, USA},
}

PPoPP'12 A Performance Analysis Framework for Identifying Potential Benefits in GPGPU Applications
Jaewoong Sim, Aniruddha Dasgupta, Hyesoon Kim, Richard Vuduc
Proc. of the 17th International Symposium on Principles and Practice of Parallel Programming (PPoPP), New Orleans, LA, Feb 2012
[PDF] [Slides] [BibTex]

@inproceedings{sim-ppopp2012,
  author    = {Jaewoong Sim and Aniruddha Dasgupta and Hyesoon Kim and Richard Vuduc},
  title     = {A Performance Analysis Framework for Identifying Potential Benefits in GPGPU Applications},
  booktitle = {Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming},
  series    = {PPoPP '12},
  year      = {2012},
  pages     = {11--22},
}