About me

조현제
The Principle of Growth

조현제 Hyunje Jo

Hardware Architect / Full-Stack Engineer
Zaleph (ℵ0): Represents my pursuit of 'Countable Infinity'.

I design infinite logical possibilities on top of finite hardware resources.

Starting from a question about natural numbers and infinity in a library, I have grown into an engineer who seeks the fundamental principles behind technology. I find my greatest motivation in the joy of learning and the correspondence between theory and real-world implementation.


Get in touch

You can reach me in several ways.

✉️ Email

a@hj4.it

💬 KakaoTalk

aabbxx


✈️ Telegram

@hj4it

📷 Instagram

@acos99

PGP Public Key

For secure communication, I use the following PGP public key.

-----BEGIN PGP PUBLIC KEY BLOCK-----
mDMEaO0QbhYJKwYBBAHaRw8BAQdAdLjOjbCunshoKBEs/eIEBdJbFXMd24mf6qke
94R7YK20FWFsZXBoXzM2MTEgPGFAaGo0Lml0PohyBBMWCAAaBAsJCAcCFQgCFgEC
GQEFgmjtEG4CngECmwMACgkQGUuxXPxxLpgAFQD+JAGV7xNNm8TeqtkT2FET67Ea
+WiWQUJiFV0CGoRdBu4A/1OxX+wwAtgBQcuVnNT3S4d4BVhnlZM1CzHkdqSqym4C
uDgEaO0QbhIKKwYBBAGXVQEFAQEHQHd2XZBmNM4x9H0vcFH7/9aJIoItDYX2YeCi
aGE6f/gGAwEIB4hhBBgWCAAJBYJo7RBuApsMAAoJEBlLsVz8cS6Y6KQA/2l9u7/g
4TbH9pRDEAIRpe+pJQNv6VQqne2H6p3oz1SSAP4+29EBMDQ4DzDxw+06hSLKjZWs
qzh6p+2OWW7rNfwHAw==
=BSh0
-----END PGP PUBLIC KEY BLOCK-----
소원 경비 비밀 무지개 박사 동화책 진로 경찰 소년 다이어트 군인 정원 
오직 자동 주먹 강제   독립 간접   변신 찻잔 음식 구별   이렇게 여름

You can use this key to send me encrypted messages or to verify my signatures.


Experience

Rebellions Inc.

NPU Design Engineer & FPGA Emulation Lead

May 2021 - Present

  • NPU Architecture & RDMA Design: Design high-performance NPU interconnects, focusing on InfiniBand (IB) RDMA logic. Resolve critical bottlenecks such as HBM direct-access contention, and design weighted priority arbiters that balance bandwidth between compute-unit and network traffic.

  • FPGA Emulation & Prototyping (Lead): Led the emulation of the ‘ATOM’ chip (1.55B gates). Overcame extreme routing congestion on Synopsys HAPS-100 (VU19P) and Xilinx U250 clusters by implementing custom SerDes logic and manual SLR partitioning.

  • Research & Optimization: First author of the ICCAD 2025 paper on CTDM. Developed a resource-efficient FPGA simulation technique based on Chain-based Time-Division Multiplexing that cut LUT usage by 66% on NVIDIA's NVDLA and enabled faster verification.

  • System DMA & Memory Architecture: Designed programmable System DMAs supporting 4-AXI Master SIMD operations and architected a 32MB On-Chip Memory system including cache coherency logic.

  • Co-Simulation Environment: Built a seamless VCS-FPGA co-simulation system to bridge the gap between pre-silicon verification and post-silicon validation.


Samsung Electronics

Simulation & Verification Engineer (System LSI)

Feb 2018 - Apr 2021

  • Automotive SoC Verification: Conducted rigorous simulation and performance analysis for Automotive SoCs (Lock-step & Split mode architectures) ensuring compliance with safety standards.

  • ARM Core Optimization: Optimized AMBA Bus interconnects and performed CPU/GPU simulations for Exynos Modems using ARM Cortex (Ananke) and Mali GPU architectures.

  • Emulator Acceleration: Migrated simulation environments from software-based models to Cadence Palladium accelerators, significantly reducing verification time for the S9 processor GPU (S5E9810).

  • DFT & Low-level Debugging: Handled DFT (Design for Testability) using Synopsys tools and performed deep-dive assembly level debugging for ARM ELF binaries.


Education

BS in Electrical and Electronics Engineering

Korea University
2011 - 2018

Activities: KUCC (Computer Club) C++ Lecturer


High School

Incheon Science High School (ISHS)
2009 - 2010

Achievement: Early Graduation (2 years) | Informatics & Math Olympiad


Publications & Articles

📄 [Paper] CTDM: Resource-Efficient FPGA-Accelerated Simulation of Large-Scale NPU Designs

Role: First Author | Venue: ICCAD 2025

Abstract: This paper proposes a novel approach to accelerate large Neural Processing Unit (NPU) design simulations on FPGA through Chain-based Time-Division Multiplexing (CTDM) and its automatic compiler.

  • Key Innovation: CTDM replaces repeated logic patterns with single logic patterns and register chains, leveraging hardware-predefined shift register primitives. This minimizes logic overhead and routing congestion, reducing FPGA resource utilization more effectively than conventional multiplexer-based TDM.
  • Scalability & Compatibility: The automated compiler supports various HDLs (Verilog, VHDL, HLS, Chisel) and diverse hardware ranging from single boards to server-grade simulators like Synopsys Zebu. It also introduces a block interleaving technique to hide inter-FPGA link latency.
  • Results: When applied to NVIDIA’s NVDLA, CTDM achieved 66% LUT and 82% FF resource reduction, enabling full deployment on a single AMD U250 FPGA. This resulted in a 3,653x acceleration in simulation time compared to CPU-based VCS.
  • Real-World Application: Successfully implemented for the verification of a proprietary 4-die 1024 TFLOPS chiplet using 144 FPGAs on Zebu Server 5.
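
The chain idea in the Key Innovation bullet can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's compiler or RTL: the repeated logic pattern is assumed to be a simple multiply-accumulate, and the names `mac_step`, `replicated`, and `chained_tdm` are hypothetical. A Python `deque` stands in for the hardware shift-register chain.

```python
from collections import deque


def mac_step(acc, x, w):
    """One copy of the repeated logic pattern (assumed here: multiply-accumulate)."""
    return acc + x * w


def replicated(states, xs, ws):
    """Spatial version: N copies of the logic, one per state register."""
    return [mac_step(s, x, w) for s, x, w in zip(states, xs, ws)]


def chained_tdm(states, xs, ws):
    """Time-multiplexed version: one logic instance plus a register chain.

    Each host step takes the state at the head of the chain, runs it through
    the single logic instance, and shifts the result in at the tail. After N
    host steps the chain holds the next state of all N virtual copies, in
    their original order.
    """
    chain = deque(states)
    for x, w in zip(xs, ws):
        head = chain.popleft()               # state of the current virtual copy
        chain.append(mac_step(head, x, w))   # shift the updated state back in
    return list(chain)


if __name__ == "__main__":
    states = [1, 2, 3, 4]
    xs = [5, 6, 7, 8]
    ws = [2, 3, 4, 5]
    # Both versions compute the same next state for every virtual copy.
    assert chained_tdm(states, xs, ws) == replicated(states, xs, ws)
```

The trade-off the model shows is the usual one for time-division multiplexing: the time-multiplexed version needs N host steps per target cycle, but only one instance of the logic.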

📄 [Paper] A Quad-Chiplet AI SoC with Full-Chip Scalable Mesh Over 16Gb/s UCIe-Advanced Die-to-Die Interface for Large-Scale AI Inferencing

Role: Co-author | Venue: ISSCC 2026

  • Source: 2026 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA

Abstract: This paper presents a 4nm-based quad-chiplet LLM accelerator achieving 56.8TPS on LLaMA v3.3 70B. The architecture integrates low-latency UCIe-Advanced die-to-die interfaces, unified mixed-precision compute, and HBM3E with advanced power schemes to sustain the bandwidth and thermal stability required for large-scale AI inferencing.


📄 [Paper] ATOMUS: A 5nm 32TFLOPS/128TOPS ML System-on-Chip for Latency Critical Applications

Role: Co-author | Venue: ISSCC 2024

  • Source: 2024 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA

Abstract: ATOMUS is a 5nm AI accelerator optimized for latency-critical applications such as high-frequency trading and SLO-based AI services. It delivers 32TFLOPS/128TOPS with outstanding single-stream responsiveness and low TDP, enabling efficient scale-out for both edge and datacenter/cloud platforms.


📰 [Article] Technical Reliability Issues in the Student Council Mobile Voting System

Role: Reporter (Korea University Newspaper) | Investigative Report | [2015.11.23]

Summary: Authored an investigative article critiquing the mobile voting system used by the university student council. The report exposed significant security vulnerabilities and a lack of technical reliability in the system, raising concerns about potential election fraud and the integrity of the digital voting process.
