Hi, my name is

Hyunje Jo.

NPU Architecture & RDMA Design Engineer | FPGA/Emulation Specialist

Bridging Hardware, System SW, and AI Infrastructure. I am a Full-Stack Hardware Architect designing the backbone of AI acceleration.

About Me

My domain is zaleph.com, derived from ‘Aleph-Null’ ($\aleph_0$)—the concept of countable infinity. Just as I pondered the correspondence between natural numbers and real numbers, I am driven by the infinite possibilities that arise when connecting disparate domains: Hardware to Software, and Chip Design to Infrastructure.

I am not just an RTL engineer. I am a Full-Stack Hardware Architect who understands the entire lifecycle of computing—from the transistor level to the cloud orchestration layer.

Core Competencies

  • NPU & RDMA Architecture: Currently designing the backbone of AI acceleration at Rebellions. My focus is InfiniBand (IB) RDMA design, resolving HBM bandwidth contention, and optimizing compute-communication overlap for large-scale AI clusters.
  • World-Class FPGA Emulation: Successfully mapped a 1.55B gate Atom chip across multi-FPGA systems (HAPS, U250). I specialize in manual partitioning, routing congestion resolution, and creating custom TDM solutions (published in ICCAD 2025).
  • System & Infrastructure: Unlike typical HW engineers, I operate my own private servers (Proxmox, ESXi, K8s) and develop full-stack web services. This experience allows me to design hardware that is truly “software-friendly” and “infrastructure-ready.”

Tech Stack

  • SystemVerilog / RTL
  • RDMA (InfiniBand)
  • NPU Architecture
  • FPGA (HAPS, ZeBu)
  • C++ / Python
  • Linux Kernel
  • Virtualization (ESXi, KVM)
  • Kubernetes / Docker

Experience

NPU Design Engineer & FPGA Emulation Lead - Rebellions Inc.
May 2021 - Present
  • NPU Architecture & RDMA Design: Designing High-Performance NPU interconnects, specifically focusing on InfiniBand (IB) RDMA logic. Solving critical bottlenecks such as HBM direct access contention and designing Weighted Priority Arbiters to optimize bandwidth between Compute Units and Network traffic.
  • FPGA Emulation & Prototyping (Lead): Led the emulation of the ‘ATOM’ chip (1.55B gates). Overcame extreme routing congestion on Synopsys HAPS-100 (VU19P) and Xilinx U250 clusters by implementing custom SerDes logic and manual SLR partitioning.
  • Research & Optimization: First author of ICCAD 2025 paper (CTDM). Developed a resource-efficient FPGA simulation technique using Chain-based Time Division Multiplexing, significantly reducing LUT usage and enabling faster verification.
  • System DMA & Memory Architecture: Designed programmable System DMAs supporting 4-AXI Master SIMD operations and architected a 32MB On-Chip Memory system including cache coherency logic.
  • Co-Simulation Environment: Built a seamless VCS-FPGA co-simulation system to bridge the gap between pre-silicon verification and post-silicon validation.
Simulation & Verification Engineer (System LSI) - Samsung Electronics
Feb 2018 - Apr 2021
  • Automotive SoC Verification: Conducted rigorous simulation and performance analysis for Automotive SoCs (lock-step and split-mode architectures), ensuring compliance with safety standards.
  • ARM Core Optimization: Optimized AMBA Bus interconnects and performed CPU/GPU simulations for Exynos Modems using ARM Cortex (Ananke) and Mali GPU architectures.
  • Emulator Acceleration: Migrated simulation environments from software-based models to Cadence Palladium accelerators, significantly reducing verification time for the GPU of the Galaxy S9 processor (S5E9810).
  • DFT & Low-level Debugging: Handled DFT (Design for Testability) using Synopsys tools and performed in-depth assembly-level debugging of ARM ELF binaries.

Education

2011 - 2018
BS in Electrical and Electronics Engineering
Korea University
Activities: KUCC (Computer Club) C++ Lecturer
2009 - 2011
High School
Incheon Science High School (ISHS)
Early Graduation (2 years). Informatics & Math Olympiad.

Key Projects

Rebellions Atom Chip Emulation
FPGA, Synopsys HAPS, Verilog
Successfully mapped a 1.55B gate Atom chip onto a multi-FPGA platform. Solved extreme routing congestion via manual partitioning and custom HSTDM optimization.
Home Lab & Private Cloud
Proxmox, Kubernetes, Linux Kernel, Self-Hosting
Operating a private cloud using Proxmox and ESXi. Configured GPU passthrough for ML workloads and host self-managed services (GitLab, Wiki, Immich) on Kubernetes.
Memkey (AI Keyboard Assistant)
Full-Stack, NLP, Android
Samsung C-Lab 2nd Place Winner. Developed the backend API (PHP) and implemented keyword analysis using KoNLPy for an Android-based recommendation keyboard.

Achievements & Publications

Samsung C-Lab Award (2nd Place)
Awarded for the 'Memkey' project, an AI-based emoticon recommendation keyboard. Developed the backend API and NLP logic.

Contact Me

Interested in NPU Architecture, FPGA Prototyping, or Home Lab setups? Let’s connect!