Software engineer, GPU inference @ Soniox d.o.o.
Job description
Soniox is pushing the boundaries of real-time speech AI, and we're looking for an engineer to help us scale the world's most advanced language models across a low-latency, high-throughput, production-grade inference stack.
In this role, you'll work at the intersection of deep learning, systems engineering, and performance optimization, helping us squeeze every FLOP out of our GPUs, reduce latency to the millisecond, and keep our systems running at global scale.
What we require from candidates
In this role, you will:
- Work closely with researchers, engineers, and product teams to bring cutting-edge AI models into real-world production.
- Architect and optimize our inference infrastructure to deliver low-latency, high-reliability performance across thousands of concurrent requests.
- Identify and eliminate system bottlenecks, improving throughput and GPU utilization across the fleet.
- Introduce and implement tools and techniques to monitor, debug, and improve model inference at scale.
- Tune our VM fleet to maximize compute, memory, and network efficiency down to the last GPU cycle.
- Support advanced research workflows by building robust, scalable systems that enable rapid experimentation.
You might thrive in this role if you:
- Have a strong intuition for optimizing modern ML architectures for inference performance.
- Are deeply familiar with PyTorch, CUDA, NCCL, and GPU internals, or excited to become an expert quickly.
- Understand HPC fundamentals and have worked with technologies like InfiniBand, NVLink, or MPI.
- Have experience building and scaling distributed systems in production, ideally performance-critical ones.
- Have rebuilt or refactored systems due to 10x+ scale increases and know what to watch out for.
- Are a self-starter who thrives in fast-moving environments and finds clarity amidst ambiguity.
- Care about reliability, simplicity, and performance, and take ownership from design to deployment.
- Have at least 5 years of professional software engineering experience.
What we offer
- The chance to work on foundational AI that redefines how humans and machines communicate.
- Global impact: your work will touch millions (and soon billions) of people across languages and cultures.
- End-to-end ownership in a lean, engineering-driven team with no bureaucracy.
- Collaboration with world-class talent in research, engineering, and product.
- A fast-growing startup environment where you shape both the technology and the company's future.
- Competitive compensation with equity ownership.
- Flexible work setup with emphasis on in-person collaboration.
- Regular team events, offsites, and a strong learning-driven culture.
Contact
https://soniox.com/careers/software-engineer-gpu-inference
Job classification
- Location:
- Ljubljana
- Compensation:
- €3000 - €6000 gross per month, plus equity and a performance bonus
- Employment type:
- regular employment
Required skills
- PyTorch, CUDA, NCCL, and GPU internals.
- advanced knowledge
About the company
Soniox is building the world's most advanced real-time conversational AI platform, designed to understand every conversation, in every language, anywhere. Our technology goes beyond speech-to-text: we deliver low-latency transcription, translation, and reasoning across 60+ languages, enabling businesses and developers to build next-generation products powered by voice.
We are innovating across the full stack of foundational AI for speech: from large-scale data acquisition and unsupervised dataset generation, to novel model architectures, training methodologies, and optimized inference engines. This holistic approach gives Soniox unmatched accuracy and efficiency compared to Google, OpenAI Whisper, AWS, and others.
Our API powers a wide range of applications: from live translation and meeting assistants to healthcare, call centers, enterprise productivity, and accessibility solutions.
Soniox is a fast-growing startup backed by global enterprise partners. Our engineering and product development hub is in Ljubljana, Slovenia, and we are expanding internationally. Our mission is simple but ambitious: make voice AI work for 8 billion people.
Why would I want to work for you
At Soniox you don't just work with AI, you create the AI that the world depends on:
- Foundation of communication: We're not just another AI company; we're building how the world will talk to machines and to each other.
- True innovation: We invent at every layer: data, model architecture, training, and inference. We don't just wrap someone else's model.
- Global impact: Our mission is voice AI for 8 billion people, including regions that never had reliable AI before.
- State of the art: We beat Google, OpenAI Whisper, AWS, and others. You'll work on the very frontier.
- Ownership & speed: Lean, fast, engineering-driven. You ship real innovations directly to users without red tape.
- World-class team: Our hub in Ljubljana, Slovenia is packed with talent obsessed with breakthrough AI. You'll work closely with founders and researchers.
- Generational opportunity: Speech is the next computing platform. Join now to help shape that future.
Programmer questionnaire
- We use source control software
- We use a bug-tracking solution (bug database)
- We use the best tools available on the market
- There is a development schedule
- We program according to a written specification
- We fix bugs before writing new code
- We employ beta testers
- Unit testing
- Employees have a quiet working environment
- Job candidates write code during interviews
- We provide employees with a warm meal at least once a day
- We provide employees with a break room for meals
- We offer employees leisure activities outside working hours
- We provide employees with a parking space