GPU kernels engineer -- listing removed

Job description

At Soniox, we build cutting-edge real-time AI systems, and we push every layer of the stack to its limit. As a GPU kernels engineer, you'll be at the heart of this effort: writing custom high-performance kernels that unlock the full potential of modern hardware and fuel massive-scale training and inference workloads.

You'll work across platform, research, and systems teams to accelerate our most demanding training jobs and shape model architectures around the physical realities of GPUs. If you care deeply about register pressure, tensor core utilization, warp shuffle efficiency, and squeezing every last FLOP from memory bandwidth, this role is for you.

What we require from candidates

In this role, you will:
- Design and implement custom GPU/CPU kernels to maximize hardware throughput and efficiency.
- Optimize for HBM utilization, instruction issue rate, cache locality, and memory bandwidth.
- Collaborate with platform and infra teams to integrate and deploy kernels at scale.
- Develop low-precision kernels and quantization-aware techniques to reduce compute without compromising ML accuracy.
- Partner with ML engineers to co-design model architectures optimized for real-time training and inference.
- Work directly with hardware vendors to advise on architecture direction and co-design opportunities.

You might thrive in this role if you:
- Write excellent C/C++ and Python code and enjoy building fast, clean low-level systems.
- Have a deep understanding of GPU (especially CUDA), CPU, or AI accelerator architectures.
- Know how to optimize every part of a compute kernel, from memory layout to instruction scheduling.
- Have experience working on large-scale ML training infrastructure, ideally for LLMs or real-time AI models.
- Are skilled in quantization and low-precision computation for modern ML workloads.
- Thrive on performance benchmarks and obsess over every percentage point of speedup.
- Have 3+ years of experience in high-performance computing, ML infra, or systems-level optimization.

What we offer candidates

What we offer
- The chance to work on foundational AI that redefines how humans and machines communicate.
- Global impact: your work will touch millions (and soon billions) of people across languages and cultures.
- End-to-end ownership in a lean, engineering-driven team with no bureaucracy.
- Collaboration with world-class talent in research, engineering, and product.
- A fast-growing startup environment where you shape both the technology and the company's future.
- Competitive compensation with equity ownership.
- Flexible work setup with emphasis on in-person collaboration.
- Regular team events, offsites, and a strong learning-driven culture.

Job classification

Location:
Ljubljana
Compensation:
€3000 - €6000 gross per month, plus equity and a performance bonus
Working hours:
regular employment

 

Required skills

Design and implement custom GPU/CPU kernels to maximize hardware throughput and efficiency.
advanced knowledge